Abstract

It is known that the existential theory of equations in free groups
is decidable. This is a famous result of Makanin. On the other hand,
it has been shown that the scheme of his algorithm is not primitive
recursive. In this paper we present an algorithm that works in polynomial
space, even in the more general setting where each variable
has a rational constraint, that is, the solution has to respect a
specification given by a regular word language. Our main result states
that the existential theory of equations in free groups with rational
constraints is PSPACE-complete. We obtain this result as a corollary
of the corresponding statement about free monoids with involution.






1 Introduction 



Around 1980 great progress was made on the algorithmic decidability
of elementary theories of free monoids and groups. In 1977 Makanin
[?] proved that the existential theory of equations in free monoids is
decidable by presenting an algorithm which solves the satisfiability problem
for a single word equation with constants. In 1983 he extended his result to
the more complicated framework of free groups [18]. In fact, using a result
by Merzlyakov [?] he also showed that the positive theory of equations in
free groups is decidable [?], and Razborov was able to give a description of
the whole solution set [?]. The algorithms of Makanin are very complex:
for word equations the running time was first estimated by several towers
of exponentials, and it took more than 20 years to lower it to the best
known bound for Makanin's original algorithm, which is to date EXPSPACE
[?]. For solving equations in free groups, Koscielski and Pacholski [14] have
shown that the scheme proposed by Makanin is not primitive recursive.

In 1999 Plandowski invented another method for solving word equations and
he showed that the satisfiability problem for word equations is in PSPACE
[26]. One ingredient of his work is to use data compression to reduce the
exponential space to polynomial space. The importance of data compression
was first recognized by Rytter and Plandowski when applying Lempel-Ziv
encodings to the minimal solution of a word equation [27]. Another important
notion is the definition of an $\ell$-factorization of the solution, which is
explained below. Gutierrez extended Plandowski's method to the case of free
groups [?]. Thus, a non-primitive recursive scheme for solving equations in free
groups has been replaced by a polynomial space bounded algorithm. Hagenah
and Diekert worked independently in the same direction and, using some ideas
of Gutierrez, they obtained a result which includes the presence of rational
constraints. This appeared as an extended abstract in [?] and also as a part of
the PhD thesis of Hagenah [?].

The present paper is a journal version of [?]. It shows that the existential
theory of equations in free groups with rational constraints is
PSPACE-complete. Rational constraints mean that a possible solution has to
respect a specification which is given by a regular word language. The idea to
consider regular constraints for word equations goes back to Schulz [?], who also
pointed out the importance of this concept, see also [?], [?]. The
PSPACE-completeness for the case of word equations with regular constraints has
been stated by Rytter already, as cited in [?, Thm. 1].

Our proof reduces the case of equations with rational constraints in free
groups to the case of equations with regular constraints in free monoids with
involution, which turns out to be the central object. (Makanin uses the
notion of "paired alphabet", but a main difference is that he considered
"non contractible" solutions only, whereas we deal with general solutions
and, in addition, we have constraints.) During our work we extend the
method of [?] such that it copes with the involution and the method of [?]
such that it copes with rational constraints. The first step is a reduction
to the satisfiability problem of a single equation with regular constraints in
a free monoid with involution. In order to avoid an exponential blow-up,
we do not use a reduction as in [?], but a simpler one. In particular, we
can handle negations simply by positive rational constraints. In the second
step we show that the satisfiability problem of a single equation with regular
constraints in a free monoid with involution is still in PSPACE. This part
is rather technical and we introduce several new notions like base-change,
projection, partial solution, and free interval. The careful handling of free
intervals is necessary because of the constraints. In some sense this is the
only additional difficulty which we will meet when dealing with constraints.
After these preparations we can follow Plandowski's method. Throughout
we shall use many of the deep ideas which were presented in [?], and apply
them in a different setting. Hence, as we cannot use Plandowski's result
as a black box, we have to go through the whole construction again. As a
result our paper is (involuntarily) self-contained, up to standard knowledge
in combinatorics on words and linear Diophantine equations.



2 Free Groups and their Rational Subsets 

Let $\Sigma$ be a finite alphabet. By $F(\Sigma)$ we denote the free group over $\Sigma$.
Elements of $F(\Sigma)$ can be represented by words in $(\Sigma \cup \bar{\Sigma})^*$, where
$\bar{\Sigma} = \{ \bar{a} \mid a \in \Sigma \}$. We read $\bar{a}$ as $a^{-1}$ in $F(\Sigma)$ and we use the
convention that $\bar{\bar{a}} = a$. Hence the set $\Gamma = \Sigma \cup \bar{\Sigma}$ is equipped with an
involution $\bar{\ } : \Gamma \to \Gamma$; the involution is extended to $\Gamma^*$ by
$\overline{a_1 \cdots a_n} = \bar{a}_n \cdots \bar{a}_1$ for $n \geq 0$ and $a_i \in \Gamma$, $1 \leq i \leq n$.
The empty word as well as the unit element in other monoids is
denoted by 1. By $\psi : \Gamma^* \to F(\Sigma)$ we denote the canonical homomorphism.
A word $w \in \Gamma^*$ is freely reduced, if it contains no factor of the form $a\bar{a}$
with $a \in \Gamma$. The reduction of a word $w \in \Gamma^*$ can be computed by using the
Noetherian and confluent rewriting system $\{ a\bar{a} \to 1 \mid a \in \Gamma \}$. For $w \in \Gamma^*$ we
denote by $\hat{w}$ the freely reduced word which denotes the same group element
in $F(\Sigma)$ as $w$. Hence, $\psi(u) = \psi(v)$ if and only if $\hat{u} = \hat{v}$ in $\Gamma^*$.
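
The reduction to the freely reduced word can be sketched in a few lines. The following Python fragment is only an illustration under conventions that are assumptions and not part of the paper: a letter of $\Sigma$ is written as a lowercase ASCII character and its formal inverse as the corresponding uppercase character, so the involution on a letter is `swapcase`. All function names are hypothetical.

```python
def bar(word: str) -> str:
    """Involution on words: reverse the word and invert every letter."""
    return word[::-1].swapcase()


def free_reduce(word: str) -> str:
    """Cancel factors a·bar(a) until none is left (one pass with a stack)."""
    stack = []
    for letter in word:
        if stack and stack[-1] == letter.swapcase():
            stack.pop()            # the pair a·bar(a) rewrites to the empty word
        else:
            stack.append(letter)
    return "".join(stack)


def same_group_element(u: str, v: str) -> bool:
    """psi(u) = psi(v) holds exactly when the freely reduced words coincide."""
    return free_reduce(u) == free_reduce(v)
```

Since the rewriting system is confluent, the order in which pairs are cancelled does not matter, which is why a single left-to-right pass suffices.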
The class of rational languages in $F(\Sigma)$ is inductively defined as follows:
Every finite subset of $F(\Sigma)$ is rational. If $P_1, P_2 \subseteq F(\Sigma)$ are rational, then
$P_1 \cup P_2$, $P_1 \cdot P_2$, and $P_1^*$ are rational. Hence, $P \subseteq F(\Sigma)$ is rational if and
only if $P = \psi(P')$ for some regular language $P' \subseteq \Gamma^*$ (footnote 1). In particular, we can
use non-deterministic finite automata over $\Gamma$ for specifying rational group
languages over $F(\Sigma)$.

The following proposition is due to M. Benois [?], see also [?, Sect. III.2].

Proposition 1 Let $P' \subseteq \Gamma^*$ be a regular language and $P = \psi(P') \subseteq F(\Sigma)$.
Then we effectively find a regular language $\widetilde{P}' \subseteq \Gamma^*$ such that
$\widetilde{P}' = \{ w \in \Gamma^* \mid \psi(w) \notin P \}$. Hence, the complement of $P$ is the rational
group language $\psi(\widetilde{P}')$ and the family of rational group languages is an
effective Boolean algebra.

Proof. (Sketch) Using the same state set (and some additional transitions
which are labeled with the empty word) we can construct (in polynomial
time) a finite automaton which accepts the following language

\[ P'' = \{ v \in \Gamma^* \mid \exists u \in P' : u \xrightarrow{*} v \} \]

where $u \xrightarrow{*} v$ means that $v$ is a descendant of $u$ by the convergent rewriting
system $\{ a\bar{a} \to 1 \mid a \in \Gamma \}$. Then we complement $P''$ with respect to $\Gamma^*$, and
we build the intersection with the regular set of freely reduced words. □
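
The automaton for $P''$ in the proof sketch can be obtained by a saturation loop. The following Python sketch is one possible way to do this and is not taken from the paper: it assumes that the automaton is given as a set of triples `(p, letter, q)`, that the empty string labels an empty-word transition, and that the inverse of a lowercase letter is the uppercase letter. Whenever a path $p \xrightarrow{a} r \xrightarrow{1^*} s \xrightarrow{\bar{a}} q$ exists, an empty-word transition from $p$ to $q$ is added; the state set stays the same.

```python
def eps_closure(states, delta):
    """All states reachable from `states` by empty-word transitions."""
    seen, stack = set(states), list(states)
    while stack:
        p = stack.pop()
        for (x, a, y) in delta:
            if x == p and a == "" and y not in seen:
                seen.add(y)
                stack.append(y)
    return seen


def saturate(delta):
    """Add empty-word transitions until descendants of accepted words are accepted."""
    delta = set(delta)
    changed = True
    while changed:
        changed = False
        for (p, a, r) in list(delta):
            if a == "":
                continue
            for s in eps_closure({r}, delta):
                for (x, b, q) in list(delta):
                    if x == s and b == a.swapcase() and (p, "", q) not in delta:
                        delta.add((p, "", q))   # a ... bar(a) may be skipped
                        changed = True
    return delta


def accepts(word, delta, initial, final):
    """Standard simulation of a non-deterministic automaton with empty-word moves."""
    current = eps_closure(initial, delta)
    for letter in word:
        step = {q for (p, a, q) in delta if p in current and a == letter}
        current = eps_closure(step, delta)
    return bool(current & set(final))
```

The loop adds at most one empty-word transition per pair of states, so it terminates after polynomially many rounds.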

3 The Existential Theory 

In the following $\Omega$ denotes a finite set of variables (or unknowns) and we let
$\bar{\ } : \Omega \to \Omega$ be an involution without fixed points. Clearly, if $X \in \Omega$ has an
interpretation in $F(\Sigma)$, then we read $\bar{X}$ as $X^{-1} \in F(\Sigma)$.
The existential theory of equations with rational constraints in free groups is
inductively defined as follows. Atomic formulae are either of the form $W = 1$,
where $W \in (\Gamma \cup \Omega)^*$, or of the form $X \in P$, where $X$ is in $\Omega$ and $P \subseteq F(\Sigma)$ is
a rational language. A propositional formula is built up from atomic formulae
using negations, conjunctions and disjunctions. The existential theory refers
to closed existentially quantified propositional formulae which evaluate to
true over $F(\Sigma)$.

Footnote 1: We follow the usual convention to call a rational subset of a free monoid regular. This
convention is due to Kleene's Theorem stating that regular, rational, and recognizable
have the same meaning in free monoids. But in free groups these notions are different and
we have to be more precise.

Theorem 2 The following problem is PSPACE-complete.
INPUT: A closed existentially quantified propositional formula with rational
constraints in the free group $F(\Sigma)$ for some finite alphabet $\Sigma$.
QUESTION: Does the formula evaluate to true over $F(\Sigma)$?



The PSPACE-hardness follows from a result of Kozen [?], since (due to
the constraints) the empty intersection problem of regular sets can easily be
encoded in the problem above. The same argument applies to Theorems 4
and 5 below and therefore the PSPACE-hardness is not discussed further in
the sequel: We only have to show the inclusion in PSPACE.
The PSPACE algorithm for solving Theorem 2 will be described by a (highly)
non-deterministic procedure. We will make sure that if the input evaluates to
true, then at least one possible output is true. If it evaluates to false, then no
(positive) output is possible. By standard methods (Savitch's Theorem) such
a procedure can be transformed into a polynomial space bounded deterministic
decision procedure, see any textbook on complexity theory, e.g. [?], [23].
We start the procedure as follows. Using the rules of DeMorgan we may
assume that there are no negations at all, but the atomic formulae are now
of one of the forms: $W = 1$, $W \neq 1$, $X \in P$, $X \notin P$ with $W \in (\Gamma \cup \Omega)^*$,
$X \in \Omega$, and $P \subseteq F(\Sigma)$ rational (footnote 2).
The next step is to replace every formula $W \neq 1$ by

\[ \exists X : WX = 1 \wedge X \notin \{1\}, \]

where $X$ is a fresh variable, hence we can put $\exists X$ to the front. Now we
eliminate all disjunctions. More precisely, every subformula of type $A \vee B$
is non-deterministically replaced either by $A$ or by $B$. At this stage the
propositional formula has become a conjunction of formulae of type $W = 1$,
$X \in P$, $X \notin P$ with $W \in (\Gamma \cup \Omega)^*$, $X \in \Omega$, and $P \subseteq F(\Sigma)$ rational.
We may assume that $|W| \geq 3$, since if $1 \leq |W| < 3$, then we may replace
$W = 1$ by $W a \bar{a} = 1$ for some $a \in \Gamma$. For the following it is convenient to
assume that $|W| = 3$ for all subformulae $W = 1$. This is also easy to achieve.
As long as there is a subformula $x_1 \cdots x_k = 1$ with $x_i \in \Gamma \cup \Omega$ for $1 \leq i \leq k$ and
$k \geq 4$, we replace it by the conjunction

\[ \exists Y : x_1 x_2 Y = 1 \wedge \bar{Y} x_3 \cdots x_k = 1, \]

where $Y$ is a fresh variable and $\exists Y$ is put to the front, and then proceed
recursively. This finishes the first phase. The output of this phase is a system
of atomic formulae of type $W = 1$, $X \in P$, $X \notin P$ with $W \in (\Gamma \cup \Omega)^3$, $X \in \Omega$,
and $P \subseteq F(\Sigma)$ rational.

Footnote 2: The reason that we keep $X \notin P$ instead of $X \in \widetilde{P}$ where $\widetilde{P} = F(\Sigma) \setminus P$ is that the
complementation may involve an exponential blow-up of the state space; this has to be
avoided.
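
The splitting of a long equation $W = 1$ into equations of length 3 can be illustrated as follows. This is a minimal sketch under assumed conventions that do not come from the paper: symbols are strings, the involution on a symbol toggles a leading `~`, and the fresh variables are named `Y1`, `Y2`, ... (it is assumed that these names do not already occur in the input).

```python
from itertools import count


def bar_symbol(x: str) -> str:
    """Involution on symbols: toggle a leading '~'."""
    return x[1:] if x.startswith("~") else "~" + x


def split_into_triples(word, pad_letter="a"):
    """Turn x1...xk = 1 into a list of triples (x, y, z), each meaning xyz = 1."""
    word = list(word)
    if not word:
        return []                                   # the empty equation 1 = 1
    if len(word) < 3:
        word += [pad_letter, bar_symbol(pad_letter)]  # W = 1 becomes W a bar(a) = 1
    fresh = (f"Y{i}" for i in count(1))
    triples = []
    while len(word) >= 4:
        y = next(fresh)
        triples.append((word[0], word[1], y))        # x1 x2 Y = 1
        word = [bar_symbol(y)] + word[2:]            # bar(Y) x3 ... xk = 1
    triples.append(tuple(word))
    return triples
```

For an input of length $k \geq 3$ the sketch produces $k - 2$ triples and $k - 3$ fresh variables, so the size of the system stays linear.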

At this point we switch to the existential theory of equations with regular
constraints in free monoids where these monoids have an involution. Recall
that $X \in P$ (resp. $X \notin P$) means in fact $X \in \psi(P')$ (resp. $X \notin \psi(P')$) where
$P' \subseteq \Gamma^*$ is a regular word language specified by some finite non-deterministic
automaton. Using $\psi$-symbols we obtain an interpretation over $(\Gamma^*, \bar{\ })$ without
changing the truth value by replacing syntactically each subformula $X \in P$
(resp. $X \notin P$) by $\psi(X) \in \psi(P')$ (resp. $\psi(X) \notin \psi(P')$) and by replacing
each subformula $W = 1$ by $\psi(W) = 1$.

We keep the interpretation over words, but we now eliminate all occurrences
of $\psi$ again. We begin with the occurrences of $\psi$ in the constraints. Let
$P' \subseteq \Gamma^*$ be regular and accepted by some finite automaton with state
set $Q$. As stated in the first part of the proof of Proposition 1, we
construct a finite automaton, using the same state set, which accepts the
following language

\[ P'' = \{ v \in \Gamma^* \mid \exists u \in P' : u \xrightarrow{*} v \}. \]

In particular, $\psi(P') = \psi(P'')$ and $\hat{P} \subseteq P''$ where $\hat{P} = \{ \hat{u} \in \Gamma^* \mid u \in P' \}$.
We replace all positive atomic subformulae of the form $\psi(X) \in \psi(P')$ by
$X \in P''$. A simple reflection shows that the truth value has not changed
since we can think of $X$ as being a freely reduced word. For a negative
formula $\psi(X) \notin \psi(P')$ we have to be a little more careful. Let $N \subseteq \Gamma^*$ be
the regular set of all freely reduced words. The language $N$ is accepted by a
deterministic finite automaton with $|\Gamma| + 1$ states. We replace $\psi(X) \notin \psi(P')$
by

\[ X \notin P'' \wedge X \in N, \]

where $P''$ is as above. Again the truth value did not change.
We now have to deal with the formulae $\psi(xyz) = 1$ where $x, y, z \in \Gamma \cup \Omega$.
Observe that the underlying propositional formula is satisfiable over $\Gamma^*$ if
and only if it is satisfiable in freely reduced words. The following lemma is
well-known. Its easy proof is left to the reader.

Lemma 3 Let $u, v, w \in \Gamma^*$ be freely reduced words. Then we have $\psi(uvw) = 1$
(i.e. $uvw = 1$ in $F(\Sigma)$) if and only if there are words $P, Q, R \in \Gamma^*$ such
that $u = PQ$, $v = \bar{Q}R$, and $w = \bar{R}\bar{P}$ holds in $\Gamma^*$.
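
The decomposition in Lemma 3 can be computed directly: $Q$ is the longest suffix of $u$ that cancels against a prefix of $v$. The following Python sketch illustrates this; it is not part of the paper, and it assumes the encoding in which a lowercase letter and the corresponding uppercase letter are mutually inverse. The function names are hypothetical.

```python
def bar(word: str) -> str:
    """Involution on words: reverse the word and invert every letter."""
    return word[::-1].swapcase()


def lemma3_witness(u: str, v: str, w: str):
    """For freely reduced u, v, w return (P, Q, R) with u = PQ, v = bar(Q)R,
    w = bar(R)bar(P), or None if uvw is not trivial in the free group."""
    q_len = 0
    # Q is the maximal part of u that cancels against the beginning of v.
    while (q_len < min(len(u), len(v))
           and u[len(u) - 1 - q_len] == v[q_len].swapcase()):
        q_len += 1
    P, Q, R = u[:len(u) - q_len], u[len(u) - q_len:], v[q_len:]
    if v == bar(Q) + R and w == bar(R) + bar(P):
        return P, Q, R
    return None
```

After the maximal cancellation the word $PR$ is freely reduced, so $uvw = 1$ forces $w = \overline{PR}$; this is the only check that can fail.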

Based on this lemma we replace each atomic subformula $\psi(xyz) = 1$ with
$x, y, z \in \Gamma \cup \Omega$ by a conjunction

\[ \exists P \exists Q \exists R : x = PQ \wedge y = \bar{Q}R \wedge z = \bar{R}\bar{P}, \]

where $P, Q, R$ are fresh variables and the existential block is put to the front.
The new existential formula has no occurrence of $\psi$ anymore. The atomic
subformulae are of the form $x = yz$, $X \in P$, $X \notin P$, where $x, y, z \in \Gamma \cup \Omega$
and $P \subseteq \Gamma^*$ is regular. The size of the formula is linear in the size of the
original formula. Therefore Theorem 2 is a consequence of Theorem 4.

4 Free Monoids with Involution 

As above, let $\Gamma$ be an alphabet of constants and $\Omega$ be an alphabet of variables.
There are involutions $\bar{\ } : \Gamma \to \Gamma$ and $\bar{\ } : \Omega \to \Omega$. The involution on $\Omega$ is
without fixed points, but we explicitly allow fixed points for the involution
on $\Gamma$ (footnote 3). The involution is extended to $(\Gamma \cup \Omega)^*$ by
$\overline{x_1 \cdots x_n} = \bar{x}_n \cdots \bar{x}_1$ for $n \geq 0$ and $x_i \in \Gamma \cup \Omega$, $1 \leq i \leq n$.

From now on, all monoids $M$ under consideration are equipped with an
involution $\bar{\ } : M \to M$, i.e. we have $\bar{1} = 1$ for the unit element, $\bar{\bar{x}} = x$, and
$\overline{xy} = \bar{y}\bar{x}$ for all $x, y \in M$. A homomorphism between monoids $M$ and $M'$ is
therefore a mapping $h : M \to M'$ such that $h(1) = 1$, $h(xy) = h(x)h(y)$, and
$h(\bar{x}) = \overline{h(x)}$ for all $x, y \in M$. The pair $(\Gamma^*, \bar{\ })$ is called a free monoid with
involution (footnote 4).

Footnote 3: Fixed points for the involution on constants are needed in the proof later anyhow and
this more general setting leads to further applications [?].

Footnote 4: Note that $(\Gamma^*, \bar{\ })$ is a free monoid which has an involution, but it is not a free object
in the category of monoids with involution, as soon as the involution has fixed points.
The existential theory of equations with regular constraints in free monoids
with involution is based on atomic formulae of type $U = V$ where
$U, V \in (\Gamma \cup \Omega)^*$ and of type $X \in P$ where $X \in \Omega$ and $P \subseteq \Gamma^*$ is a regular language
specified by some non-deterministic finite automaton. Again, a propositional
formula is built up from atomic formulae using negations, conjunctions and
disjunctions. The existential theory refers to closed existentially quantified
propositional formulae which evaluate to true over $(\Gamma^*, \bar{\ })$.
The following statement is the main result of the paper.

Theorem 4 The following problem is PSPACE-complete.
INPUT: A closed existentially quantified propositional formula with regular
constraints in a free monoid with involution over $(\Gamma, \bar{\ })$.
QUESTION: Does the formula evaluate to true over $(\Gamma^*, \bar{\ })$?

The proof of Theorem 4 is in a first step (next section) a reduction to
Theorem 5. The proof of Theorem 5 will be the essential technical contribution.

5 From Regular Constraints to Boolean Matrices and a Single Equation

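The first part of the proof is very similar to what we have done above. By
DeMorgan we have no negations and all subformulae are of type $U = V$,
$U \neq V$, $X \in P$, $X \notin P$, where $U, V \in (\Gamma \cup \Omega)^*$, $X \in \Omega$, and $P \subseteq \Gamma^*$ is
regular.

Since we work over a free monoid $\Gamma^*$ it is easy to handle inequalities $U \neq V$
where $U, V \in (\Gamma \cup \Omega)^*$. We recall it under the assumption $|\Gamma| \geq 2$: A
subformula $U \neq V$ is replaced by

\[ \exists X \exists Y \exists Z : \bigvee_{a \in \Gamma} (U = VaX \vee V = UaX) \;\vee \bigvee_{a, b \in \Gamma,\ a \neq b} (U = XaY \wedge V = XbZ). \]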

Making guesses we can eliminate all disjunctions and we obtain a propositional
formula which is a single conjunction over subformulae of type $U = V$,
$X \in P$, and $X \notin P$ where $U, V \in (\Gamma \cup \Omega)^*$, $X \in \Omega$, and $P \subseteq \Gamma^*$ is regular.
By another standard procedure we can replace a conjunction of word equations
over $(\Gamma \cup \Omega)^*$ by a single word equation $L = R$ with $L, R \in (\Gamma \cup \Omega)^+$.
For example, we may choose a new letter $a$ and then we can replace a system
$L_1 = R_1, L_2 = R_2, \ldots, L_k = R_k$ by $L_1 a L_2 a \cdots a L_k = R_1 a R_2 a \cdots a R_k$ and a
list $X \in \Gamma^*$ for all $X \in \Omega$; this works since $a \notin \Gamma$.

Therefore we may assume that our input is given by a single equation $L = R$
with $L, R \in (\Gamma \cup \Omega)^+$ and by two lists $(X_j \in P_j, 1 \leq j \leq m)$ and
$(X_j \notin P_j, m < j \leq k)$ where $X_j \in \Omega$, and each regular language $P_j \subseteq \Gamma^*$ is specified
by some non-deterministic automaton $A_j = (Q_j, \Gamma, \delta_j, I_j, F_j)$ where $Q_j$ is the
set of states, $\delta_j \subseteq Q_j \times \Gamma \times Q_j$ is the transition relation, $I_j \subseteq Q_j$ is the
subset of initial states, and $F_j \subseteq Q_j$ is the subset of final states, $1 \leq j \leq k$.
Of course, a variable $X$ may occur several times in the list with different
constraints, therefore we might have $k$ greater than $|\Omega|$. The question is
whether there is a solution.

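A solution is a mapping $\sigma : \Omega \to \Gamma^*$ being extended to a homomorphism
$\sigma : (\Gamma \cup \Omega)^* \to \Gamma^*$ by leaving the letters from $\Gamma$ invariant such that the
following conditions are satisfied:

\[ \sigma(L) = \sigma(R), \]
\[ \sigma(\bar{X}) = \overline{\sigma(X)} \quad \text{for } X \in \Omega, \]
\[ \sigma(X_j) \in P_j \quad \text{for } 1 \leq j \leq m, \]
\[ \sigma(X_j) \notin P_j \quad \text{for } m < j \leq k. \]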

For the next steps it turns out to be more convenient to work within the
framework of Boolean matrices instead of finite automata: Let $Q$ be the
disjoint union of the state spaces $Q_j$, $1 \leq j \leq k$. We may assume that
$Q = \{1, \ldots, n\}$. Let $\delta = \bigcup_{1 \leq j \leq k} \delta_j$, then $\delta \subseteq Q \times \Gamma \times Q$ and with each $a \in \Gamma$
we can associate a Boolean $n \times n$ matrix $g(a) \in \mathbb{B}^{n \times n}$ such that $g(a)_{i,j}$ =
"$(i, a, j) \in \delta$" for $1 \leq i, j \leq n$. Since our monoids should have an involution,
we shall in fact work with $2n \times 2n$ matrices. Henceforth $M \subseteq \mathbb{B}^{2n \times 2n}$ denotes
the following monoid with involution:

\[ M = \left\{ \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix} \;\middle|\; A, B \in \mathbb{B}^{n \times n} \right\}, \]

where

\[ \overline{\begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}} = \begin{pmatrix} B^T & 0 \\ 0 & A^T \end{pmatrix} \]

and the operator $^T$ denotes the transposition. We define a homomorphism
$h : \Gamma^* \to M$ by

\[ h(a) = \begin{pmatrix} g(a) & 0 \\ 0 & g(\bar{a})^T \end{pmatrix} \quad \text{for } a \in \Gamma, \]

where the mapping $g : \Gamma \to \mathbb{B}^{n \times n}$ is defined as above. The homomorphism $h$
can be computed in polynomial time and it respects the involution. Now, for
each regular language $P_j$, $1 \leq j \leq k$, we compute vectors $I_j, F_j \in \mathbb{B}^{2n}$ such
that for all $w \in \Gamma^*$ and $1 \leq j \leq k$ we have the equivalence:

\[ w \in P_j \iff I_j^T h(w) F_j = 1. \]

Having done these computations we make a non-deterministic guess $\rho(X) \in M$
for each variable $X \in \Omega$. We verify $\rho(\bar{X}) = \overline{\rho(X)}$ for all $X \in \Omega$ and
whenever there is a constraint of type $X \in P_j$ for some $1 \leq j \leq m$ (or
$X \notin P_j$ for some $m < j \leq k$), then we verify $I_j^T \rho(X) F_j = 1$, if $1 \leq j \leq m$
(or $I_j^T \rho(X) F_j = 0$, if $m < j \leq k$).
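
The passage from automata to Boolean matrices can be made concrete with a small sketch. The following Python code is an illustration under stated assumptions and is not taken from the paper: states are numbered from 0, a lowercase letter and its uppercase counterpart are mutually inverse, and an element of $M$ is stored as the pair of its two diagonal blocks. The constraint test only inspects the upper block, which corresponds to vectors $I_j, F_j$ whose lower halves are zero.

```python
def bool_mul(A, B):
    """Product of two Boolean n x n matrices (entries 0 or 1)."""
    n = len(A)
    return [[int(any(A[i][k] and B[k][j] for k in range(n))) for j in range(n)]
            for i in range(n)]


def transpose(A):
    return [list(row) for row in zip(*A)]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


def g(letter, delta, n):
    """g(a)[i][j] = 1 iff (i, a, j) is a transition."""
    return [[int((i, letter, j) in delta) for j in range(n)] for i in range(n)]


def h_letter(letter, delta, n):
    """h(a) = diag(g(a), g(bar(a))^T), stored as a pair of blocks."""
    return (g(letter, delta, n), transpose(g(letter.swapcase(), delta, n)))


def mul(X, Y):
    return (bool_mul(X[0], Y[0]), bool_mul(X[1], Y[1]))


def involution(X):
    """diag(A, B) is mapped to diag(B^T, A^T)."""
    return (transpose(X[1]), transpose(X[0]))


def h(word, delta, n):
    result = (identity(n), identity(n))
    for letter in word:
        result = mul(result, h_letter(letter, delta, n))
    return result


def satisfies(X, initial, final):
    """Value of I^T X F for the vectors of the given initial and final states."""
    return any(X[0][i][j] for i in initial for j in final)
```

With this representation, guessing $\rho(X)$ amounts to guessing a pair of Boolean matrices, and every constraint check is a lookup in the upper block.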

After these preliminaries, we introduce the formal definition of an equation
$E$ with constraints: Let $d, n \in \mathbb{N}$ and let $M \subseteq \mathbb{B}^{2n \times 2n}$ be the monoid with
involution defined above. We consider an equation of length $d$ over some
$\Gamma$ and $\Omega$, with constraints in $M$ being specified by a list $E$ containing the
following items:

• The alphabet $(\Gamma, \bar{\ })$ with involution.

• The homomorphism $h : \Gamma^* \to M$ which is specified by a mapping
$h : \Gamma \to M$ such that $h(\bar{a}) = \overline{h(a)}$ for all $a \in \Gamma$.

• The alphabet $(\Omega, \bar{\ })$ with involution without fixed points.

• A mapping $\rho : \Omega \to M$ such that $\rho(\bar{X}) = \overline{\rho(X)}$ for all $X \in \Omega$.

• The equation $L = R$ where $L, R \in (\Gamma \cup \Omega)^+$ and $|LR| = d$.

We will denote this list simply by

\[ E = (\Gamma, h, \Omega, \rho; L = R). \]

A convenient definition for the input size is given by $n + d + \log_2(|\Gamma| + |\Omega|)$.
This definition takes into account that there might be constants or variables
with constraints which are not present in the equation. Recall that $n$ refers
to the dimension of the Boolean matrices, and this parameter is part of the
input.

A solution of $E$ is a mapping $\sigma : \Omega \to \Gamma^*$ (being extended to a homomorphism
$\sigma : (\Gamma \cup \Omega)^* \to \Gamma^*$ by leaving the letters from $\Gamma$ invariant) such that the
following three conditions are satisfied:

\[ \sigma(L) = \sigma(R), \]
\[ \sigma(\bar{X}) = \overline{\sigma(X)} \quad \text{for all } X \in \Omega, \]
\[ h\sigma(X) = \rho(X) \quad \text{for all } X \in \Omega. \]

By the reduction above, Theorem 4 is a consequence of the next statement
which says that the satisfiability problem of equations with constraints can
be solved in polynomial space.

Theorem 5 The following problem is PSPACE-complete.
INPUT: An equation $E_0$ with constraints $E_0 = (\Gamma_0, h_0, \Omega_0, \rho_0; L_0 = R_0)$.
QUESTION: Is there a solution $\sigma : \Omega_0 \to \Gamma_0^*$?

For the proof we need an explicit space bound. Therefore we fix some
polynomial $p$ and we allow working space $p(n + d + \log_2(|\Gamma| + |\Omega|))$. An
appropriate choice of the polynomial $p$ can be calculated from the presentation
below. What is important is that the notions of admissibility being used
in the next sections always refer to some fixed polynomials. The following
lemma states that some basic operations, which we have to perform several
times, can be done in PSPACE.

Lemma 6 The following two problems can be solved in polynomial space with
respect to the input size $n + \log(|\Gamma|)$.

INPUT: A matrix $A \in M$ and a mapping $h : \Gamma \to M$.
QUESTION: Is there some $w \in \Gamma^*$ such that $h(w) = A$?

INPUT: A matrix $A \in M$ and a mapping $h : \Gamma \to M$.
QUESTION: Is there some $w \in \Gamma^*$ such that $h(w) = A$ and $w = \bar{w}$?

Proof. The first question can be solved by guessing a word $w$ letter by
letter and calculating $h(w)$. The second question can be solved since $w = \bar{w}$
implies $w = ua\bar{u}$ for some $u \in \Gamma^*$ and $a \in \Gamma \cup \{1\}$ with $a = \bar{a}$. Hence we can
guess $u$ and $a$. During the guess we compute $B = h(u)$ and then we verify
$A = B\,h(a)\,\bar{B}$. □

Here is a first application of Lemma 6: Assume that an equation with
constraints $E = (\Gamma, h, \Omega, \rho; L = R)$ contains in the specification some variable
$X$ which does not occur in $LR\bar{L}\bar{R}$. Then the equation might be unsolvable,
simply because $\rho(X) \notin h(\Gamma^*)$. However, by the lemma above we can test this
in PSPACE. If $\rho(X) \in h(\Gamma^*)$, then we can safely cancel $X$ and $\bar{X}$. Thus,
we put this test in the preprocessing, and in the following we shall assume
that all variables occur somewhere in $LR\bar{L}\bar{R}$. In particular, we may assume
$|\Omega| \leq 2|LR|$.
6 The Exponent of Periodicity

A key step in proving Theorem 5 is to find a bound on the exponent of
periodicity in a minimal solution. This idea is used in all known algorithms
for solving word equations in general, cf. [?], [26].

Let $w \in \Gamma^*$ be a word. The exponent of periodicity $\exp(w)$ is defined by

\[ \exp(w) = \sup\{ \alpha \in \mathbb{N} \mid \exists u, v, p \in \Gamma^*, p \neq 1 : w = u p^{\alpha} v \}. \]

We have $\exp(w) > 0$ if and only if $w$ is not the empty word. Let
$E = (\Gamma, h, \Omega, \rho; L = R)$ be an equation with constraints. The exponent of
periodicity of $E$ is also denoted by $\exp(E)$. It is defined by

\[ \exp(E) = \inf(\{ \exp(\sigma(L)) \mid \sigma \text{ is a solution of } E \} \cup \{\infty\}). \]

By definition we have $\exp(E) < \infty$ if and only if $E$ is solvable. Here we
show that the well-known result from word equations [13] transfers to the
situation here. The exponent of periodicity of a solvable equation can be
bounded by a singly exponential function. Thus, in the following sections we
shall assume that if $E_0$ is solvable, then $\exp(E_0) \in 2^{O(d + n \log n)}$. This is the
content of the next proposition.
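
For a concrete word the exponent of periodicity can be computed by brute force. The following Python function is only an illustration of the definition (the name is hypothetical, and no attempt at efficiency is made): for every start position and every non-empty period it counts how often the period repeats consecutively.

```python
def exponent_of_periodicity(w: str) -> int:
    """Largest alpha such that w = u p^alpha v for some non-empty word p."""
    best = 0                       # alpha = 0 is always possible
    n = len(w)
    for start in range(n):
        for length in range(1, n - start + 1):
            p = w[start:start + length]
            reps = 1
            # count further consecutive copies of p
            while w.startswith(p, start + reps * length):
                reps += 1
            best = max(best, reps)
    return best
```

The function returns 0 exactly for the empty word, in accordance with the remark after the definition.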

Proposition 7 Let $E = (\Gamma, h, \Omega, \rho; L = R)$ be an equation with constraints
and let $\sigma : \Omega \to \Gamma^*$ be a solution. Then we find effectively a solution
$\sigma' : \Omega \to \Gamma^*$ such that $\exp(\sigma'(L)) \in 2^{O(d + n \log n)}$.

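The rest of this section is devoted to proving Proposition 7. Since it follows
standard lines, the proof can be skipped in a first reading.
Proof. Let $p \in A^+$ be a primitive word. In our setting the definition of the
p-stable normal form of a word $w \in A^*$ depends on whether or
not $\bar{p}$ is a factor of $p^2$. So we distinguish two cases and in the following we
also write $p^{-1}$ for denoting $\bar{p}$. Then, for example, $p^{-\alpha}$ means the same as $\bar{p}^{\alpha}$.
First case: We assume that $\bar{p}$ is not a factor of $p^2$. The idea is to replace
each maximal factor of the form $p^{\alpha}$ with $\alpha \geq 2$ by a sequence $p, \alpha - 2, p$ and
each maximal factor of the form $\bar{p}^{\alpha}$ with $\alpha \geq 2$ by a sequence $\bar{p}, -(\alpha - 2), \bar{p}$.
This leads to the following notion:

The p-stable normal form (first kind) of $w \in A^*$ is a shortest sequence ($k$ is
minimal)

\[ (u_0, \varepsilon_1\alpha_1, u_1, \ldots, \varepsilon_k\alpha_k, u_k) \]

such that $k \geq 0$, $u_0, u_i \in A^*$, $\varepsilon_i \in \{+1, -1\}$, $\alpha_i \geq 0$ for $1 \leq i \leq k$, and the
following conditions are satisfied:

• $w = u_0 p^{\varepsilon_1\alpha_1} u_1 \cdots p^{\varepsilon_k\alpha_k} u_k$.

• $k = 0$ if and only if neither $p^2$ nor $\bar{p}^2$ is a factor of $w$.

• If $k \geq 1$, then:

\[ u_0 \in A^* p^{\varepsilon_1} \setminus A^* p^{\pm 2} A^*, \]
\[ u_i \in (A^* p^{\varepsilon_{i+1}} \cap p^{\varepsilon_i} A^*) \setminus A^* p^{\pm 2} A^* \quad \text{for } 1 \leq i < k, \]
\[ u_k \in p^{\varepsilon_k} A^* \setminus A^* p^{\pm 2} A^*. \]

The p-stable normal form of $\bar{w}$ becomes

\[ (\bar{u}_k, -\varepsilon_k\alpha_k, \bar{u}_{k-1}, \ldots, -\varepsilon_1\alpha_1, \bar{u}_0). \]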

Example 8 Let $p = aabaa$ with $a = \bar{a}$, $b \neq \bar{b}$ and $w = p^4\,\bar{b}aa\,p^{-1}\,aa\bar{b}\,p^2$. Then the
p-stable normal form of $w$ is:

\[ (aabaa,\ 2,\ aabaa\bar{b}aa,\ -1,\ aa\bar{b}aabaa,\ 0,\ aabaa). \]

Second case: We assume that $\bar{p}$ is a factor of $p^2$. Then we can write $p = rs$
with $\bar{p} = sr$ and $\bar{r} = r$, $\bar{s} = s$. We allow $r = 1$, hence the second case includes
the case $\bar{p} = p$. In fact, if $r = 1$, then below we obtain the usual definition of
p-stable normal form. Moreover, by switching to some conjugated word of $p$
we could always assume that $r \in \{1, a\}$ for some letter $a$ being fixed by the
involution, $a = \bar{a}$, but this switch is not made here. The idea is to replace
each maximal factor of the form $(rs)^{\alpha}r$ with $\alpha \geq 2$ by a sequence $rs, \alpha - 2, sr$.
In this notation $\alpha - 2$ is representing the factor $(rs)^{\alpha-2}r = p^{\alpha-2}r = r\bar{p}^{\alpha-2}$.
The p-stable normal form (second kind) of $w \in A^*$ is now a shortest sequence
($k$ is minimal)

\[ (u_0, \alpha_1, u_1, \ldots, \alpha_k, u_k) \]

such that $k \geq 0$, $u_0, u_i \in A^*$, $\alpha_i \geq 0$ for $1 \leq i \leq k$, and the following
conditions are satisfied:

• $w = u_0 p^{\alpha_1} r u_1 \cdots p^{\alpha_k} r u_k$.

• $k = 0$ if and only if $p^2 r$ is not a factor of $w$.

• If $k \geq 1$, then:

\[ u_0 \in A^* rs \setminus (A^* p^2 r A^* \cup A^* rsrs), \]
\[ u_i \in (A^* rs \cap sr A^*) \setminus (srsr A^* \cup A^* p^2 r A^* \cup A^* rsrs) \quad \text{for } 1 \leq i < k, \]
\[ u_k \in sr A^* \setminus (A^* p^2 r A^* \cup srsr A^*). \]

Since $\overline{rs} = sr$, the p-stable normal form of $\bar{w}$ becomes

\[ (\bar{u}_k, \alpha_k, \bar{u}_{k-1}, \ldots, \alpha_1, \bar{u}_0). \]

So, for the second kind no negative integers interfere.

Example 9 Let $p = aab$ with $a = \bar{a}$ and $b = \bar{b}$. Then $r = aa$ and $s = b$. Let
$w = a\,p^4\,a\,p^3\,a$. Then the p-stable normal form of $w$ is:

\[ (aaab,\ 2,\ baaab,\ 0,\ baaba). \]
In both cases we can write the p-stable normal form of $w$ as a sequence

\[ (u_0, \alpha_1, u_1, \ldots, \alpha_k, u_k) \]

where the $u_i$ are words and the $\alpha_i$ are integers.

For every finite semigroup $S$ there is a number $c(S)$ such that for all $s \in S$
the element $s^{c(S)}$ is idempotent, i.e., $s^{c(S)} = s^{2c(S)}$. It is clear that the number
$c(M)$ for our monoid $M \subseteq \mathbb{B}^{2n \times 2n}$ is the same as the number $c(\mathbb{B}^{n \times n})$. It is
well-known that we can take $c(\mathbb{B}^{n \times n}) = n!$ (it is however more convenient
to define $c(M) = 3$ for $n = 1$). Hence in the following $c(M) = \max\{3, n!\}$.
For specific situations this might be an overestimation, but this choice
guarantees $h(uv^{c(M)}w) = h(uv^{2c(M)}w)$ for all $u, v, w \in \Gamma^*$ and all $h : \Gamma^* \to M$.
Now, let $w, w' \in \Gamma^*$ be words such that the p-stable normal forms are identical
up to one position where for $w$ appears an integer $\alpha_i$ and for $w'$ appears an
integer $\alpha_i'$. We know $h(w) = h(w')$ whenever the following conditions are
satisfied: $\alpha_i \cdot \alpha_i' > 0$, $|\alpha_i| \geq c(M)$, $|\alpha_i'| \geq c(M)$, and $\alpha_i \equiv \alpha_i' \pmod{c(M)}$.
This is the reason to change the syntax of the p-stable
normal form. Each non-zero integer $\alpha'$ is written as $\alpha' = \varepsilon(q + \alpha c(M))$
where $\varepsilon, q, \alpha$ are uniquely defined by $\varepsilon \in \{+1, -1\}$, $0 \leq q < c(M)$, and
$\alpha \geq 0$. For $\alpha' = 0$ we may choose $\varepsilon = q = \alpha = 0$. We shall read $\alpha$ as a
variable ranging over non-negative integers, but $\varepsilon$, $q$, and $c(M)$ are viewed
as constants. In fact, if $|\alpha'| < c(M)$, then we best view $\alpha$ also as a constant
in order to avoid problems with the constraints.
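
The statement that the power $n!$ makes every Boolean $n \times n$ matrix idempotent can be checked exhaustively for very small $n$. The following Python sketch does this by brute force; it is only a sanity check for tiny dimensions, not a proof, and the function names are hypothetical.

```python
from itertools import product
from math import factorial


def bool_mul(A, B):
    """Product of two Boolean n x n matrices given as tuples of tuples."""
    n = len(A)
    return tuple(tuple(int(any(A[i][k] and B[k][j] for k in range(n)))
                       for j in range(n)) for i in range(n))


def power(A, k):
    """A^k for k >= 1 by repeated multiplication."""
    result = A
    for _ in range(k - 1):
        result = bool_mul(result, A)
    return result


def all_boolean_matrices(n):
    for bits in product((0, 1), repeat=n * n):
        yield tuple(tuple(bits[i * n:(i + 1) * n]) for i in range(n))


def idempotent_power_works(n, c):
    """True iff A^c is idempotent for every Boolean n x n matrix A."""
    for A in all_boolean_matrices(n):
        E = power(A, c)
        if bool_mul(E, E) != E:
            return False
    return True


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, idempotent_power_works(n, factorial(n)))
```

For $n = 2$ the exponent 1 is not sufficient, since the permutation matrix that swaps the two states is not idempotent.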

Let $u$, $v$, and $w$ be words such that $uv = w$ holds. Write these words in their
p-stable normal forms:

\[ u : (u_0, \varepsilon_1(q_1 + \alpha_1 c(S)), u_1, \ldots, \varepsilon_k(q_k + \alpha_k c(S)), u_k), \]
\[ v : (v_0, \varepsilon_1'(s_1 + \beta_1 c(S)), v_1, \ldots, \varepsilon_\ell'(s_\ell + \beta_\ell c(S)), v_\ell), \]
\[ w : (w_0, \varepsilon_1''(t_1 + \gamma_1 c(S)), w_1, \ldots, \varepsilon_m''(t_m + \gamma_m c(S)), w_m). \]

Since $uv = w$ there are many identities. For example, for $k, \ell \geq 2$ we have
$u_0 = w_0$, $v_\ell = w_m$, $q_1 = t_1$, $\alpha_1 = \gamma_1$, etc. What exactly happens depends only
on the p-stable normal form of the product $u_k v_0$. There are several cases,
which can easily be listed. We treat only one of them, which is in some sense
the worst case in order to produce a large exponent of periodicity. This
is the case where $p = rs$ with $\bar{r} = r$ and $\bar{s} = s$. Then it might be that
$u_k = srsr_1$ and $v_0 = r_2srs$ with $r_1r_2 = r$ (and $r_1 \neq 1 \neq r_2$). Hence we
have $u_k v_0 = sp^3$ and $k + \ell = m + 1$. It follows $\alpha_1 = \gamma_1, \ldots, \alpha_{k-1} = \gamma_{k-1}$,
$\beta_2 = \gamma_{k+1}, \ldots, \beta_\ell = \gamma_m$, and there is only one non-trivial identity:

\[ q_k + s_1 + 4 + (\alpha_k + \beta_1)c(S) = t_k + \gamma_k c(S). \]

Since by assumption $c(S) \geq 3$, the case $u_k v_0 = sp^3$ leads to the identity:

\[ \gamma_k = \alpha_k + \beta_1 + c \quad \text{with } c \in \{0, 1, 2\}. \]

Assume now that $\alpha_k \geq 1$ and $\beta_1 \geq 1$. If we replace $\alpha_k$, $\beta_1$, and $\gamma_k$ by some
$\alpha_k' \geq 1$, $\beta_1' \geq 1$, and $\gamma_k' \geq 1$ such that still $\gamma_k' = \alpha_k' + \beta_1' + c$, then we obtain
new words $u'$, $v'$, and $w'$ with the same images under $h$ in $M$ and still the
identity $u'v' = w'$.
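
The range of the constant $c$ can be verified by a short calculation. The following derivation is a sketch that assumes the normalized representation in which the remainders satisfy $0 \leq q_k, s_1, t_k < c(S)$; it only spells out the step from the non-trivial identity to $\gamma_k = \alpha_k + \beta_1 + c$.

```latex
% Assumption: normalized remainders, 0 <= q_k, s_1, t_k < c(S), and c(S) >= 3.
\begin{align*}
  q_k + s_1 + 4 + (\alpha_k + \beta_1)\,c(S) &= t_k + \gamma_k\,c(S), \\
  4 \;\le\; q_k + s_1 + 4 &\;\le\; 2\,(c(S) - 1) + 4 \;=\; 2\,c(S) + 2 \;<\; 3\,c(S), \\
  \gamma_k &= \alpha_k + \beta_1 + c, \qquad
  c = \left\lfloor \frac{q_k + s_1 + 4}{c(S)} \right\rfloor \in \{0, 1, 2\}.
\end{align*}
```

The strict inequality in the second line is exactly where the assumption $c(S) \geq 3$ is used.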

What follows then is completely analogous to what has been done in detail
in [?], [?], [11], [?]. Using the p-stable normal form we can associate with an
equation $L = R$ of denotational length $d$ together with its solution $\sigma : \Omega \to \Gamma^*$
some linear Diophantine system of $d$ equations in at most $3d$ variables. The
variables range over natural numbers since zeros are substituted. (In fact the
number of variables can be reduced to be at most $2|\Omega|$.) The parameters of
this system are such that the maximal size of a minimal solution (with respect to
the component wise partial order of $\mathbb{N}^{3d}$) is in $O(2^{cd})$ for an explicit constant $c$,
with the same approach as in [?]. This tight bound is based in turn on the work of [?];
a more moderate bound $2^{O(d)}$ (which is enough for our purposes) is easier to obtain,
see e.g. [?]. The maximal size of a minimal solution of the linear Diophantine
system has a backward translation to a bound on the exponent of periodicity.
For this translation we have to multiply with the factor $c(M) \in 2^{O(n \log n)}$
and to add $c(M) + 1$. Putting everything together we obtain the claim of the
proposition. □



7 Exponential Expressions 

During the procedure which solves Theorem 5 various other equations with
constraints are considered but the monoid $M$ will not change.
In general, there will not be enough space to write down the equation $L = R$
in plain form. In fact, there is a provable exponential lower bound for the
length $|LR|$ in the worst case which we can meet during the procedure. In
order to overcome this difficulty Plandowski's method uses data compression
for words in $(\Gamma \cup \Omega)^*$ in terms of exponential expressions.
Exponential expressions (their evaluation and their size) are inductively
defined:

• Every word $w \in \Gamma^*$ denotes an exponential expression. The evaluation
eval($w$) is equal to $w$, its size $\|w\|$ is equal to the length $|w|$.

• Let $e, e'$ be exponential expressions. Then $ee'$ is an exponential expression.
Its evaluation is the concatenation eval($ee'$) = eval($e$)eval($e'$), its
size is $\|ee'\| = \|e\| + \|e'\|$.

• Let $e$ be an exponential expression and $k \in \mathbb{N}$. Then $(e)^k$ is an exponential
expression. Its evaluation is eval($(e)^k$) = (eval($e$))$^k$, its size is
$\|(e)^k\| = \log(k) + \|e\|$ where $\log(k) = \max\{1, \lceil \log_2(k) \rceil\}$.
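
The inductive definition translates directly into a small data type. The following Python sketch is an illustration only: the class and function names are hypothetical, and the rounding in $\log(k)$ is assumed to be the ceiling, as in the definition above.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Word:
    text: str


@dataclass(frozen=True)
class Concat:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Power:
    base: "Expr"
    exponent: int


Expr = Union[Word, Concat, Power]


def evaluate(e: Expr) -> str:
    """eval(e): the (possibly exponentially long) word denoted by e."""
    if isinstance(e, Word):
        return e.text
    if isinstance(e, Concat):
        return evaluate(e.left) + evaluate(e.right)
    return evaluate(e.base) * e.exponent


def log_size(k: int) -> int:
    """log(k) = max{1, ceil(log2(k))}; (k - 1).bit_length() is ceil(log2(k)) for k >= 1."""
    return max(1, (k - 1).bit_length()) if k >= 1 else 1


def size(e: Expr) -> int:
    """||e||: the size of the expression, not of the word it denotes."""
    if isinstance(e, Word):
        return len(e.text)
    if isinstance(e, Concat):
        return size(e.left) + size(e.right)
    return log_size(e.exponent) + size(e.base)
```

For instance, the expression `Power(Concat(Word("ab"), Power(Word("c"), 3)), 1000)` has size 15 but denotes a word of length 5000.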

It is not difficult to show that the length of eval(e) is at most exponential 
in the size of e, a fact which is, strictly speaking, not needed for the proof 
of Theorem |^. What we need however is the next lemma. Its proof can be 
done easily by structural induction and it is omitted. 

Lemma 10 Let $u \in \Gamma^*$ be a factor of a word $w \in \Gamma^*$. Assume that $w$ can
be represented by some exponential expression of size $p$. Then we find an
exponential expression of size at most $p^2$ that represents $u$.
We say that an exponential expression $e$ is admissible, if its size $\|e\|$ is
bounded by some fixed polynomial in the input size of $E_0$. The lemma above
states that if $e$ is admissible, then we find admissible exponential expressions
for all factors of $\mathrm{eval}(e)$. But now the admissibility is defined with respect
to some polynomial which is the square of the original polynomial, so, in a
nested way, we can apply this procedure only a constant number of times.
In our application the nesting depth does not go beyond two.
The next lemma is straightforward since we allow a polynomial space bound 
without any time restriction. Again, the proof is left to the reader. 

Lemma 11 The following two problems can be solved in PSPACE. 

INPUT: Exponential expressions e and e' . 
QUESTION: Do we have eval(e) = eval(e') ? 

INPUT: A mapping $h : \Gamma \to M$ and an exponential expression $e$.
OUTPUT: The matrix $h(\mathrm{eval}(e)) \in M$.

Remark 12 The computation above can actually be performed in polynomial 
time, but this is not evident for the first question, see for details. 
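
For the second problem of Lemma 11, one standard way to avoid writing out $\mathrm{eval}(e)$ is repeated squaring in the matrix monoid. The following is a minimal sketch under stated assumptions: Boolean matrices are encoded as tuples of 0/1 rows, exponential expressions as nested tuples `("word", w)`, `("cat", e1, e2)`, `("pow", e, k)`, and the mapping `h` is a small hypothetical example. None of these encodings come from the paper.

```python
def identity(n):
    """The n x n Boolean identity matrix."""
    return tuple(tuple(int(i == j) for j in range(n)) for i in range(n))


def bool_mul(a, b):
    """Product of two n x n Boolean matrices (OR of ANDs)."""
    n = len(a)
    return tuple(
        tuple(int(any(a[i][k] and b[k][j] for k in range(n))) for j in range(n))
        for i in range(n)
    )


def bool_pow(m, k):
    """m^k by repeated squaring, using O(log k) matrix products."""
    result = identity(len(m))
    while k > 0:
        if k & 1:
            result = bool_mul(result, m)
        m = bool_mul(m, m)
        k >>= 1
    return result


def image(e, h, n):
    """h(eval(e)) computed on the expression, without expanding eval(e)."""
    kind = e[0]
    if kind == "word":
        result = identity(n)
        for letter in e[1]:
            result = bool_mul(result, h[letter])
        return result
    if kind == "cat":
        return bool_mul(image(e[1], h, n), image(e[2], h, n))
    if kind == "pow":
        return bool_pow(image(e[1], h, n), e[2])
    raise ValueError(f"unknown expression kind: {kind!r}")


# Hypothetical mapping h into 2 x 2 Boolean matrices.
h = {"a": ((0, 1), (1, 0)), "b": ((1, 1), (0, 1))}

# (a)^(10^18): far too long to expand, yet its image needs about 60 squarings.
huge = ("pow", ("word", "a"), 10**18)
print(image(huge, h, 2) == identity(2))  # prints "True"
```

The number of matrix products is linear in the size of the expression, since an exponent $k$ contributes about $\log_2 k$ squarings.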

Henceforth we allow that the part $L = R$ of an equation with constraints may
also be given by a pair of exponential expressions $(e_L, e_R)$ with $\mathrm{eval}(e_L) = L$
and $\mathrm{eval}(e_R) = R$. We say that $E = (\Gamma, h, \Omega, \rho; e_L = e_R)$ is admissible, if
$e_L e_R$ is admissible, $|\Gamma \setminus \Gamma_0|$ has polynomial size, $\Omega \subseteq \Omega_0$, and $h(a) = h_0(a)$
for $a \in \Gamma \cap \Gamma_0$.

For two admissible equations with constraints $E = (\Gamma, h, \Omega, \rho; e_L = e_R)$ and
$E' = (\Gamma, h, \Omega, \rho; e'_L = e'_R)$ we write $E \equiv E'$, if $\mathrm{eval}(e_L) = \mathrm{eval}(e'_L)$ and
$\mathrm{eval}(e_R) = \mathrm{eval}(e'_R)$ as strings in $(\Gamma \cup \Omega)^*$. This means that they represent
exactly the same equations.

8 Base Changes 

In this section we fix a mapping $h : \Gamma \to M$ which respects the involution.
Let $(\Gamma', \bar{\ })$ be an alphabet with involution and let $\beta : \Gamma' \to \Gamma^*$ be some
mapping such that $\beta(\bar{a}) = \overline{\beta(a)}$ for all $a \in \Gamma'$. We define $h' : \Gamma' \to M$ such
that $h' = h\beta$. We also extend $\beta$ to a homomorphism $\beta : (\Gamma' \cup \Omega)^* \to (\Gamma \cup \Omega)^*$
by leaving the variables invariant.
Let $E' = (\Gamma', h', \Omega, \rho; L' = R')$ be an equation with constraints. The base
change $\beta_*(E')$ is defined by

$\beta_*(E') = (\Gamma, h, \Omega, \rho; \beta(L') = \beta(R'))$.

We also refer to $\beta : \Gamma' \to \Gamma^*$ as a base change and we say that $\beta$ is admissible,
if $|\Gamma'|$ has polynomial size and if $\beta(a)$ can be represented by some admissible
exponential expression for all $a \in \Gamma'$.

Remark 13 If $\beta : \Gamma' \to \Gamma^*$ is an admissible base change and if $L' = R'$ is
given by a pair of admissible exponential expressions, then we can represent
$\beta_*(E')$ by some admissible equation with constraints. A representation of
$\beta_*(E')$ is computable in polynomial time.

Lemma 14 Let $E'$ be an equation with constraints and $\beta : \Gamma' \to \Gamma^*$ be a
base change. If $\sigma'$ is a solution of $E'$, then $\sigma = \beta\sigma'$ is a solution of $\beta_*(E')$.

Proof. Clearly $\sigma(\bar{X}) = \overline{\sigma(X)}$ and $h\sigma(X) = h\beta\sigma'(X) = h'\sigma'(X) = \rho(X)$ for
all $X \in \Omega$. Next by definition $\sigma(a) = a$ for $a \in \Gamma$ and $\beta(X) = X$ for $X \in \Omega$.
Hence $\sigma\beta(a) = \beta\sigma'(a)$ for $a \in \Gamma'$ and therefore $\sigma\beta = \beta\sigma' : (\Gamma' \cup \Omega)^* \to \Gamma^*$.
This means $\sigma\beta(L') = \beta\sigma'(L') = \beta\sigma'(R') = \sigma\beta(R')$ since $\sigma'(L') = \sigma'(R')$. □

The lemma above leads to the first rule.

Rule 1 If $E$ is of the form $\beta_*(E')$ and if we are looking for a solution of $E$,
then it is enough to find a solution for $E'$. Hence, during a non-deterministic
search we may replace $E$ by $E'$.
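
The word-equation part of Lemma 14 can be checked on a toy instance. This is a sketch under stated assumptions: a barred letter is modelled as the uppercase version of the letter, variables are strings starting with `$`, and the equation `$X a b = a b $X` with its solution is a hypothetical example. The constraints (the mappings $h$ and $\rho$) are left out entirely.

```python
def bar(word: str) -> str:
    """Involution on words: reverse the word and bar every letter (swap case)."""
    return word[::-1].swapcase()


def make_base_change(images):
    """Extend beta from unbarred letters so that beta(bar(a)) = bar(beta(a))."""
    beta = dict(images)
    for letter, word in images.items():
        beta[bar(letter)] = bar(word)
    return beta


def apply_hom(mapping, symbols):
    """Apply a homomorphism to a list of symbols; unmapped symbols stay fixed."""
    out = []
    for s in symbols:
        if s in mapping:
            out.extend(mapping[s])   # a word contributes its letters one by one
        else:
            out.append(s)
    return out


# Base change beta on the smaller alphabet {a, b, A, B}.
beta = make_base_change({"a": "abcb", "b": "bcb"})

# Hypothetical equation E': $X a b = a b $X with solution sigma'($X) = ab.
lhs_prime, rhs_prime = ["$X", "a", "b"], ["a", "b", "$X"]
sigma_prime = {"$X": "ab"}

# The equation beta_*(E') and the solution sigma = beta sigma' from Lemma 14.
lhs, rhs = apply_hom(beta, lhs_prime), apply_hom(beta, rhs_prime)
sigma = {x: "".join(apply_hom(beta, list(w))) for x, w in sigma_prime.items()}

left_word = "".join(apply_hom(sigma, lhs))
right_word = "".join(apply_hom(sigma, rhs))
print(left_word == right_word)  # prints "True"
```

The converse direction fails in general, which is why Rule 1 is a guess inside a non-deterministic search.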



Example 15 Consider the following equation $E$ with constraints over
$\Gamma = \{a, b, c, \bar{a}, \bar{b}, \bar{c}\}$:

XX = YbcbabcbYZabcbY. 

Let there be the constraints for X and Z saying X G F^™F* and Z G 6c6aF*. 
Define $\Gamma' = \{a, b, \bar{a}, \bar{b}\}$ and a base change $\beta : \Gamma' \to \Gamma^*$ by $\beta(a) = abcb$ and
$\beta(b) = bcb$. Then the equation $E$ is of the form $\beta_*(E')$ where $E'$ is given by

XX = YabYZaY 
and the new (and sharper) constraint for $Z$ is simply $Z \in a\Gamma'^*$, for $X$ we
may sharpen the constraint to X E r'^'^'^^F'* According to Rule 1 it is enough 
to solve $E'$. The effect of the base change $\beta$ is that the equation $E'$ is shorter
and the alphabet of constants becomes smaller, since the letter $c$ is not used
anymore. Note also that the length restriction on $X$ became smaller, too.
However this has a price; in general, $E = \beta_*(E')$ might have a solution,
whereas $E'$ is unsolvable. As we will see later, our guess has been correct in
the sense that $E'$ still has a solution.

9 Projections 

Let $(\Gamma, \bar{\ })$ and $(\Gamma', \bar{\ })$ be alphabets with involution such that $(\Gamma, \bar{\ }) \subseteq (\Gamma', \bar{\ })$.
A projection is a homomorphism $\pi : \Gamma'^* \to \Gamma^*$ such that both $\pi(a) = a$ for
$a \in \Gamma$ and $\pi(\bar{a}) = \overline{\pi(a)}$ for all $a \in \Gamma'$. If $h : \Gamma \to M$ is given, then a projection
$\pi$ defines also $h' : \Gamma' \to M$ by $h' = h\pi$.

Let $E$ be an equation with constraints $E = (\Gamma, h, \Omega, \rho; L = R)$. Then we can
define an equation with constraints $\pi^*(E)$ by

$\pi^*(E) = (\Gamma', h\pi, \Omega, \rho; L = R)$.

The difference between $E$ and $\pi^*(E)$ is only in the alphabets of constants
and in the mappings $h$ and $h' = h\pi$. Note that every projection $\pi : \Gamma'^* \to \Gamma^*$
defines a base change $\pi_*$ such that $\pi_*\pi^*(E) = E$.

Lemma 16 Let $E = (\Gamma, h, \Omega, \rho; L = R)$ and $E' = (\Gamma', h', \Omega, \rho; L = R)$ be
equations with constraints. Then the following two statements hold.

i) There is a projection $\pi : \Gamma'^* \to \Gamma^*$ such that $\pi^*(E) = E'$, if and only if
both $h'(\Gamma') \subseteq h(\Gamma^*)$ and for all $a \in \Gamma'$ with $a = \bar{a}$ there is some $w \in \Gamma^*$
with $w = \bar{w}$ such that $h'(a) = h(w)$.

ii) If we have $\pi^*(E) = E'$ and if $\sigma' : \Omega \to \Gamma'^*$ is a solution of $E'$, then we
effectively find a solution $\sigma$ for $E$ such that $|\sigma(L)| < 2|M||\sigma'(L)|$.

Proof. i) Clearly, the only-if condition is satisfied by the definition of a
projection since then $h' = h\pi$. For the converse, assume that $h'(\Gamma') \subseteq h(\Gamma^*)$
and that $a = \bar{a}$ implies $h'(a) \in h(\{w \in \Gamma^* \mid w = \bar{w}\})$. Then for each $a \in \Gamma' \setminus \Gamma$
we can choose a word $w_a \in \Gamma^*$ such that $h'(a) = h(w_a)$. We can make the
choice such that $w_{\bar{a}} = \overline{w_a}$ for all $a \in \Gamma' \setminus \Gamma$. If $a \neq \bar{a}$, then we can find $w_a$
such that $|w_a| < |M|$, since we can take the shortest word $w_a \in \Gamma^*$ such that
$h(w_a) = h'(a) \in M$. For $a = \bar{a}$ we know that there is some word $w_a \in \Gamma^*$ with
$h'(a) = h(w_a)$ and $w_a = \overline{w_a}$. Hence we can write $w_a = v b \bar{v}$ with $b \in \Gamma \cup \{1\}$
and $b = \bar{b}$. For $b \neq 1$ we can demand $|w_a| < 2|M| - 1$. For $b = 1$ we can
demand $|w_a| < 2|M| - 2$. Thus, we find a projection $\pi : \Gamma'^* \to \Gamma^*$ such that
$\pi^*(E) = E'$ and moreover, $|\pi(a)| < 2|M|$ for all $a \in \Gamma'$.
ii) Using the reasoning in the proof of i) we may assume that $\pi : \Gamma'^* \to \Gamma^*$
satisfies $|\pi(a)| < 2|M|$ for all $a \in \Gamma'$. Since $\pi$ defines a base change with
$\pi_*(E') = E$, we know by Lemma 14 that $\sigma = \pi\sigma'$ is a solution of $E$. Clearly,
$|\sigma(L)| = |\pi\sigma'(L)| < 2|M||\sigma'(L)|$. □

Remark 17 In the following we will meet the problem to decide whether
there is a projection $\pi : \Gamma'^* \to \Gamma^*$ such that $\pi^*(E) = E'$. We actually need
not too much space for this test. It is not necessary to write down $\pi$. We can
use the criterion in the lemma above and Lemma Then we have to store 
in the working space only some Boolean matrices of $\mathbb{B}^{n \times n}$. In particular, if
$n$ is a constant (or logarithmically bounded in the input size), then the test
$\exists \pi : \pi^*(E) = E'$ can be done in polynomial time. However, if $n$ becomes a
substantial part of the input size, then the test might be difficult in the sense
that we might need the full power of PSPACE.

The lemma above leads now to the second rule.

Rule 2 If $\pi$ is a projection and if we are looking for a solution of $E$, then
it is enough to find a solution for $\pi^*(E)$. Hence, during a non-deterministic
search we may replace $E$ by $\pi^*(E)$.
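
The first half of the criterion in Lemma 16 i), namely $h'(\Gamma') \subseteq h(\Gamma^*)$ together with short witnesses, can be sketched by a breadth-first search over the submonoid generated by $h(\Gamma)$. This is an illustration under stated assumptions: matrices are tuples of 0/1 rows, the names `shortest_words` and `choose_projection` are hypothetical, the two generators are a made-up example, and the extra palindrome condition for letters with $a = \bar{a}$ is not handled.

```python
from collections import deque


def identity(n):
    return tuple(tuple(int(i == j) for j in range(n)) for i in range(n))


def bool_mul(a, b):
    n = len(a)
    return tuple(
        tuple(int(any(a[i][k] and b[k][j] for k in range(n))) for j in range(n))
        for i in range(n)
    )


def shortest_words(h, n):
    """Map every element of the monoid generated by h to a shortest word for it."""
    one = identity(n)
    words = {one: ""}
    queue = deque([one])
    while queue:
        m = queue.popleft()
        for letter in sorted(h):
            nxt = bool_mul(m, h[letter])
            if nxt not in words:          # first visit in BFS order is a shortest word
                words[nxt] = words[m] + letter
                queue.append(nxt)
    return words


def choose_projection(h, h_prime, n):
    """Pick pi(a) with h(pi(a)) = h'(a) for each new letter a, or return None."""
    words = shortest_words(h, n)
    if any(m not in words for m in h_prime.values()):
        return None
    return {a: words[m] for a, m in h_prime.items()}


# Hypothetical generators in 2 x 2 Boolean matrices.
h = {"a": ((0, 1), (0, 0)), "b": ((0, 0), (1, 0))}
words = shortest_words(h, 2)
pi = choose_projection(h, {"c": ((1, 0), (0, 0))}, 2)
print(len(words), pi)  # prints "6 {'c': 'ab'}"
```

Every word found this way is shorter than the number of reachable monoid elements, in line with the bound $|w_a| < |M|$ used in the proof. The search enumerates the generated monoid explicitly, so it is only meant for tiny examples; Remark 17 is about doing the test without such an enumeration.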



Example 18 Let us continue with the equation which has been obtained by 
the transformation in Example 15. In order to simplify notations, we will 
call $E$ the equation XX = YabYZaY, and $\Gamma = \{a, b, \bar{a}, \bar{b}\}$.
Remember that the constraint on $X$ demanded a rather long solution. Therefore
we may reintroduce a letter $c$ and put $\Gamma' = \{a, b, c, \bar{a}, \bar{b}, \bar{c}\}$. Then we
may define a projection tt : F' ^ F* by, say, 7r(c) = b^^^ . The equation 
$E' = \pi^*(E)$ looks as above, but in $E'$ we may change the constraint for $X$.
We may sharpen the new constraint for $X$ to be $X \in \Gamma'^* c\, \Gamma'^*$. Thus, the
solution for $X$ might be very short now.
10 Partial Solutions



Let $\Omega' \subseteq \Omega$ be a subset of the variables which is closed under involution. We
assume that there is a mapping $\rho' : \Omega' \to M$ with $\rho'(\bar{X}) = \overline{\rho'(X)}$, but we do
not require that $\rho'$ is the restriction of $\rho : \Omega \to M$. Consider an equation
with constraints $E = (\Gamma, h, \Omega, \rho; L = R)$. A partial solution is a mapping
$\delta : \Omega \to \Gamma^*\Omega'\Gamma^* \cup \Gamma^*$ such that the following conditions are satisfied:

i) $\delta(X) \in \Gamma^* X \Gamma^*$ for all $X \in \Omega'$,

ii) $\delta(X) \in \Gamma^*$ for all $X \in \Omega \setminus \Omega'$,

iii) $\delta(\bar{X}) = \overline{\delta(X)}$ for all $X \in \Omega$.

The mapping $\delta$ is extended to a homomorphism $\delta : (\Gamma \cup \Omega)^* \to (\Gamma \cup \Omega')^*$ by
leaving the elements of $\Gamma$ invariant. Let $E' = (\Gamma, h, \Omega', \rho'; L' = R')$ be another
equation with constraints (using the same $\Gamma$ and $h$). We write $E' = \delta_*(E)$,
if there exists some partial solution $\delta : \Omega \to \Gamma^*\Omega'\Gamma^* \cup \Gamma^*$ such that the
following conditions hold: $L' = \delta(L)$, $R' = \delta(R)$, $\rho(X) = h(u)\rho'(X)h(v)$ for
$\delta(X) = uXv$, and $\rho(X) = h(w)$ for $\delta(X) = w \in \Gamma^*$.

Lemma 19 In the notation above, let $E' = \delta_*(E)$ for some partial solution
$\delta : \Omega \to \Gamma^*\Omega'\Gamma^* \cup \Gamma^*$. If $\sigma'$ is a solution of $E'$, then $\sigma = \sigma'\delta$ is a solution of
$E$. Moreover, we have $\sigma(L) = \sigma'(L')$ and $\sigma(R) = \sigma'(R')$.

Proof. By definition, $\delta$ and $\sigma'$ are extended to homomorphisms
$\delta : (\Gamma \cup \Omega)^* \to (\Gamma \cup \Omega')^*$ and $\sigma' : (\Gamma \cup \Omega')^* \to \Gamma^*$ leaving the letters of $\Gamma$ invariant. Since
$E' = \delta_*(E)$ we have $\delta(L) = L'$ and $\delta(R) = R'$. Since $\sigma'$ is a solution, we have
$\sigma(L) = \sigma'\delta(L) = \sigma'(L') = \sigma'(R') = \sigma'\delta(R) = \sigma(R)$ and $\sigma$ leaves the letters of
$\Gamma$ invariant. The solution $\sigma'$ satisfies $h\sigma'(X) = \rho'(X)$ for all $X \in \Omega'$. Hence,
if $\delta(X) = uXv$, then $\rho(X) = h(u)\rho'(X)h(v) = h(u\sigma'(X)v) = h\sigma'(uXv) =
h\sigma'\delta(X) = h\sigma(X)$. If $\delta(X) = w \in \Gamma^*$, then $\sigma(X) = \sigma'\delta(X) = w$ and
$\rho(X) = h(w)$, again by the definition of a partial solution. □

Lemma 20 The following problem can be solved in PSPACE.

INPUT: Two equations with constraints $E = (\Gamma, h, \Omega, \rho; e_L = e_R)$ and
$E' = (\Gamma, h, \Omega', \rho'; e_{L'} = e_{R'})$.
QUESTION: Is there some partial solution $\delta$ such that $\delta_*(E) = E'$?

Moreover, if $\delta_*(E) = E'$ is true, then there are admissible exponential
expressions $e_u$, $e_v$ for each $X \in \Omega'$ and an admissible exponential expression
$e_w$ for each $X \in \Omega \setminus \Omega'$ such that

$\delta(X) = \mathrm{eval}(e_u)\, X\, \mathrm{eval}(e_v)$ for $X \in \Omega'$,
$\delta(X) = \mathrm{eval}(e_w)$ for $X \in \Omega \setminus \Omega'$.

Proof. Let $L = \mathrm{eval}(e_L)$, $R = \mathrm{eval}(e_R)$, $L' = \mathrm{eval}(e_{L'})$, and $R' = \mathrm{eval}(e_{R'})$.
The non-deterministic algorithm works as follows:

For each $X \in \Omega'$ we guess admissible exponential expressions $e_u$ and $e_v$ with
$\mathrm{eval}(e_u), \mathrm{eval}(e_v) \in \Gamma^*$. We define an exponential expression $e_X = e_u X e_v$
and $\delta(X) = \mathrm{eval}(e_X)$. For each $X \in \Omega \setminus \Omega'$ we guess an admissible exponential
expression $e_X$ with $\mathrm{eval}(e_X) \in \Gamma^*$ and $\delta(X) = \mathrm{eval}(e_X)$.

Next we verify whether or not $\delta_*(E) = E'$. During this test we have to
create an exponential expression $f_L$ (and $f_R$, resp.) by replacing $X$ in $e_L$
(and $e_R$, resp.) with the expression $e_X$. This increases the size in the worst
case by a factor of $\max\{\|e_X\| \mid X \in \Omega\}$. The other tests whether $\rho(X) =
h(u)\rho'(X)h(v)$ for $\delta(X) = uXv$ and $\rho(X) = h(w)$ for $\delta(X) = w \in \Gamma^*$ involve
admissible exponential expressions over Boolean matrices and can be done
in polynomial time.

The correctness of the algorithm follows from our general assumption that
all $X \in \Omega$ appear in $LR\bar{L}\bar{R}$. Therefore, if we have $\delta_*(E) = E'$, then $\delta(X)$ (or
$\delta(\bar{X})$) appears necessarily as a factor in $L'R' = \delta(LR)$. Hence $\delta(X)$ has an
exponential expression of polynomial size by Lemma 10. Therefore guesses
of $e_u$, $e_v$, and $e_w$ as above are possible without running out of space. □

Remark 21 Actually, the test for $\delta_*(E) = E'$ can be performed in
non-deterministic polynomial time by Remark 12.

The lemma above leads to the third and last rule.

Rule 3 If $\delta$ is a partial solution and if we are looking for a solution of $E$, then
it is enough to find a solution for $\delta_*(E)$. Hence, during a non-deterministic
search we may replace $E$ by $\delta_*(E)$.

Remark 22 We can think of a partial solution $\delta : \Omega \to \Gamma^*\Omega'\Gamma^* \cup \Gamma^*$ in
the following sense. Assume we have an idea about $\sigma(X)$ for some $X \in \Omega$.
Then we might guess $\sigma(X)$ entirely. In this case we can define $\delta(X) = \sigma(X)$
and we have $X \notin \Omega'$. For some other $X$ we might guess only some prefix
$u$ and some suffix $v$ of $\sigma(X)$. Then we define $\delta(X) = uXv$ and we have
to guess some $\rho'(X) \in M$ such that $\rho(X) = h(u)\rho'(X)h(v)$. If our guess was
correct, then such a matrix $\rho'(X) \in M$ must exist. We have partially specified
the solution and applying Rule 3, we continue this process by replacing the
equation $L = R$ by the new equation $\delta(L) = \delta(R)$.
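
The composition $\sigma = \sigma'\delta$ from Lemma 19 can be traced on a toy instance. This is a sketch under stated assumptions: variables are strings starting with `$`, words are lists of symbols, and the equation `$X a $Y = a $X $Y` with the guessed partial solution is a hypothetical example. The involution condition iii) and the constraint tests on $\rho$ and $\rho'$ are left out.

```python
def apply_hom(mapping, symbols):
    """Apply a homomorphism given on variables; all other symbols stay fixed."""
    out = []
    for s in symbols:
        out.extend(mapping.get(s, [s]))
    return out


# Hypothetical equation E:  $X a $Y = a $X $Y
lhs = ["$X", "a", "$Y"]
rhs = ["a", "$X", "$Y"]

# Partial solution: a prefix of $X is guessed, $Y is guessed entirely.
delta = {"$X": ["a", "$X"], "$Y": ["b", "c"]}

# The new equation E' = delta_*(E):  a $X a b c = a a $X b c
lhs_prime = apply_hom(delta, lhs)
rhs_prime = apply_hom(delta, rhs)

# A solution of E' and the induced solution sigma = sigma' delta of E.
sigma_prime = {"$X": ["a"]}
sigma = {x: apply_hom(sigma_prime, image) for x, image in delta.items()}

word_e = "".join(apply_hom(sigma, lhs))
word_e_prime = "".join(apply_hom(sigma_prime, lhs_prime))
print(word_e, word_e_prime)  # prints "aaabc aaabc"
```

Both equations are solved by the same word, which is the statement $\sigma(L) = \sigma'(L')$ of the lemma.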

Example 23 We continue with our running example. After renaming, the 
equation E is given by 

XX = YabYZaY, 

and the alphabet of constants is given by $\Gamma = \{a, b, c, \bar{a}, \bar{b}, \bar{c}\}$. The constraints
are $X \in \Gamma^* c\, \Gamma^*$ and $Z \in a\{a, b, \bar{a}, \bar{b}\}^*$.

We may guess the partial solution as follows: $\delta(X) = aX$, $\delta(Y) = Y$, and
$\delta(Z) = ab$. The new equation $\delta_*(E)$ is

aXXa = YabYabaY . 

The remaining constraint is that the solution for X has to use the letter c. 
The process can continue; for example, we can apply Rule 1 again by defining
another base change $\beta(b) = ba$ to get the equation

aXXa = YbYabY 

over $\Gamma = \{a, b, c, \bar{a}, \bar{b}, \bar{c}\}$. Since the last equation has a solution (e.g., given
by $\sigma(X) = bccbbabc$ and $\sigma(Y) = abccb$), the first equation with constraints in
Example 15 has a solution too.

11 The Search Graph and Plandowski's Algorithm

In the following we show that there is some fixed polynomial (which can be
calculated from the presentation below) such that the high-level description
of Plandowski's algorithm is as follows: On input $E_0$ compute the maximal
space bound, given by the polynomial, to be used by the procedure. Then
apply non-deterministically Rules 1, 2, and 3 until an equation with a trivial
solution is found.
From the description above it follows that the specification of the algorithm
just uses Rules 1, 2, 3. The algorithm is simple but it demands good
heuristics to explore the search graph. The hard part is to prove that this
schema is correct; for this we have to be more precise.

The search graph is a directed graph: The nodes are admissible equations
with constraints. For two nodes $E$, $E'$, we define an arc $E \to E'$, if there are
an admissible base change $\beta$, a projection $\pi$, and a partial solution $\delta$ such
that $\delta_*(\pi^*(E)) = \beta_*(E')$.

Lemma 24 The following problem can be decided in PSPACE.
INPUT: Admissible equations with constraints $E$ and $E'$.
QUESTION: Is there an arc $E \to E'$ in the search graph?

Proof. We first guess some alphabet $(\Gamma'', \bar{\ })$ of polynomial size together with
$h'' : \Gamma'' \to M$. Then we guess some admissible base change $\beta : \Gamma' \to \Gamma''^*$ such
that $h' = h''\beta$ and we compute $\beta_*(E')$.

Next we guess some admissible equation with constraints $E''$ which uses
$\Gamma''$ and $\Omega$. We check using Lemma 20 that there is some partial solution
$\delta : \Omega \to \Gamma''^*\Omega'\Gamma''^* \cup \Gamma''^*$ such that $\delta_*(E'') = \beta_*(E')$. (Note that every
equation with constraints $E''$ satisfying $\delta_*(E'') = \beta_*(E')$ for some $\delta$ is admissible
by Lemma 10.) Finally we check using Lemma 16 and Remark 17 that there is some
projection $\pi : \Gamma''^* \to \Gamma^*$ such that $\pi^*(E) = E''$. We obtain $\delta_*(\pi^*(E)) = \beta_*(E')$.
□



Remark 25 Following Remarks 17 and 21 the problem in Lemma 24 can be
decided in non-deterministic polynomial time, if the monoid $M$ is not part
of the input and viewed as a constant. If, as in our setting, $M$ is part of
the input, then PSPACE is the best we can prove, because the test for the
projection becomes difficult.

Plandowski's algorithm works as follows: 

begin
    E := E_0
    while Ω ≠ ∅ do
        Guess an admissible equation E' with constraints
        Verify that E → E' is an arc in the search graph
        E := E'
    endwhile
    return "eval(e_L) = eval(e_R)"
end
By Rules 1-3 (Lemmata 14, 16 ii), and 19), if $E \to E'$ is an arc in the search
graph and $E'$ is solvable, then $E$ is solvable, too. Thus, if the algorithm
returns true, then E^ is solvable. The proof of Theorem |^ is therefore reduced 
to the statement that if $E_0$ is solvable, then the search graph contains a path
to some node without variables and the exponential expressions defining the
equation evaluate to the same word. This existence proof is the hard part;
it covers the rest of the paper.

Remark 26 If $E \to E'$ is due to some $\pi : \Gamma''^* \to \Gamma^*$, $\delta : \Omega \to \Gamma''^*\Omega'\Gamma''^* \cup \Gamma''^*$,
and $\beta : \Gamma'^* \to \Gamma''^*$, then a solution $\sigma' : \Omega' \to \Gamma'^*$ of $E'$ yields the solution
$\sigma = \pi(\beta\sigma')\delta$. Hence we may assume that the length of a solution has increased
by at most an exponential factor by Lemma 16 ii). Since we are going to
perform the search in a graph of at most exponential size, we get automatically
a doubly exponential upper bound for the length of a minimal solution by
backwards computation on such a path. This is still the best known upper
bound (although a singly exponential bound is conjectured), see [?].



12 Free Intervals 

In this section we introduce the notion of free interval in order to cope with 
long factors in the solution which are not related to any cut. If there were 
no constraints, then these factors would not appear in a minimal solution. 
In our setting we cannot avoid these factors. 

For a word $w \in \Gamma^*$ we let $\{0, \ldots, |w|\}$ be the set of its positions. The
interpretation is that factors of $w$ are between positions. To be more specific
let $w = a_1 \cdots a_m$, $a_i \in \Gamma$ for $1 \le i \le m$. Then $[\alpha, \beta]$ with $0 \le \alpha < \beta \le m$
is called a positive interval and the factor $w[\alpha, \beta]$ is defined by the word

$w[\alpha, \beta] = a_{\alpha+1} \cdots a_{\beta}$.

It is convenient to have an involution on the set of intervals. Therefore $[\beta, \alpha]$
is also called an interval (but it is never positive), and we define $w[\beta, \alpha] =
\overline{w[\alpha, \beta]}$. We allow also $\alpha = \beta$ and we define $w[\alpha, \alpha]$ to be the empty word.
For all $0 \le \alpha, \beta \le m$ we let $\overline{[\alpha, \beta]} = [\beta, \alpha]$, then always $w\,\overline{[\alpha, \beta]} = \overline{w[\alpha, \beta]}$.
Let us focus on the word $w_0 \in \Gamma_0^*$ which in our notation is the solution $w_0 =
\sigma(L_0) = \sigma(R_0)$, where $L_0 = x_1 \cdots x_g$ and $R_0 = x_{g+1} \cdots x_d$, $x_i \in (\Gamma_0 \cup \Omega_0)$
for $1 \le i \le d$. We are going to define an equivalence relation $\approx$ on the set
of intervals of $w_0$. For this we have to fix some more notation. We let
$m_0 = |w_0|$ and for $i \in \{1, \ldots, d\}$ we define positions $l(i) \in \{0, \ldots, m_0 - 1\}$
and $r(i) \in \{1, \ldots, m_0\}$ by the congruences

$l(i) \equiv |\sigma(x_1 \cdots x_{i-1})| \bmod m_0$,
$r(i) \equiv -|\sigma(x_{i+1} \cdots x_d)| \bmod m_0$.

This means, the factor $\sigma(x_i)$ starts in $w_0$ at the left position $l(i)$ and it ends
at the right position $r(i)$. In particular, we have $l(1) = l(g+1) = 0$ and
$r(g) = r(d) = m_0$. The set of $l$ and $r$ positions is called the set of cuts.
Thus, the set of cuts is $\{l(i), r(i) \mid 1 \le i \le d\}$. There are at most $d$ cuts.
These positions cut the word $w_0$ in at most $d - 1$ factors. For convenience
we henceforth assume $2 \le g < d < m_0$ whenever necessary. We make also
the assumption that $\sigma(x_i) \neq 1$ for all $1 \le i \le d$. This assumption can be
realized e.g. by a first step in Plandowski's algorithm using a partial solution
$\delta$ which sends a variable $X$ to the empty word, if $\sigma(X) = 1$ and sends $X$ to
itself otherwise. Another choice to realize this assumption is by a guess in
some preprocessing.

We have $\sigma(x_i) = w_0[l(i), r(i)]$ and $\sigma(\overline{x_i}) = w_0[r(i), l(i)]$ for $1 \le i \le d$. By our
assumption, the interval $[l(i), r(i)]$ is positive. Let us consider a pair $(i, j)$
such that $i, j \in \{1, \ldots, d\}$ and $x_i = x_j$ or $x_i = \overline{x_j}$. For $\mu, \nu \in \{0, \ldots, r(i) - l(i)\}$
we define a relation $\sim$ by:

$[l(i) + \mu, l(i) + \nu] \sim [l(j) + \mu, l(j) + \nu]$, if $x_i = x_j$,
$[l(i) + \mu, l(i) + \nu] \sim [r(j) - \mu, r(j) - \nu]$, if $x_i = \overline{x_j}$.

Note that $\sim$ is a symmetric relation. Moreover, $[\alpha, \beta] \sim [\alpha', \beta']$ implies both
$[\beta, \alpha] \sim [\beta', \alpha']$ and $w_0[\alpha, \beta] = w_0[\alpha', \beta']$. By $\approx$ we denote the reflexive
and transitive closure of $\sim$. Then $\approx$ is an equivalence relation and again,
$[\alpha, \beta] \approx [\alpha', \beta']$ implies both $[\beta, \alpha] \approx [\beta', \alpha']$ and $w_0[\alpha, \beta] = w_0[\alpha', \beta']$.
Next we define the notion of free interval. An interval $[\alpha, \beta]$ is called free,
if whenever $[\alpha, \beta] \approx [\alpha', \beta']$, then there is no cut $\gamma'$ with $\min\{\alpha', \beta'\} < \gamma' <
\max\{\alpha', \beta'\}$. Clearly, the set of free intervals is closed under involution, i.e., if
$[\alpha, \beta]$ is free, then $[\beta, \alpha]$ is free, too. It is also clear that $[\alpha, \beta]$ is free whenever
$|\beta - \alpha| \le 1$.

Example 27 The last equation in Example 23, namely

aXXa = YbYabY,

has a solution which yields the word

      0   1      5   6     9    11  12  13     17  18
w_0 = | a | bccb | b | abc | cb | a | b | bccb | a |

The set of cuts is shown by the bars. The intervals [1,5], [13,17], and [6,9]
are not free, since $[1,5] \approx [17,13] \approx [7,11]$ and $[6,9] \approx [0,3]$ and [7,11], [0,3]
contain cuts. There is only one equivalence class of free intervals of length
longer than 1 (up to involution), which is given by $[1,3] \sim [17,15] \sim [7,9] \sim
[11,9] \sim [5,3] \sim [13,15]$.
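
The cut positions in Example 27 depend only on the lengths of the factors $\sigma(x_i)$, so they can be recomputed with a few lines. The function name `cut_positions` is a hypothetical choice for this sketch. The lengths are read off from $\sigma(X) = bccbbabc$ (8 letters) and $\sigma(Y) = abccb$ (5 letters) given in Example 23. Restarting the offset at 0 for the right-hand side plays the role of the reduction modulo $m_0$ in the definition of $l(i)$ and $r(i)$.

```python
def cut_positions(lengths_left, lengths_right):
    """Collect all positions l(i) and r(i) from the factor lengths of both sides."""
    m0 = sum(lengths_left)
    if sum(lengths_right) != m0:
        raise ValueError("both sides must evaluate to words of the same length")
    positions = set()
    for side in (lengths_left, lengths_right):
        start = 0                          # offset inside w_0, i.e. taken modulo m0
        for length in side:
            positions.add(start)           # l(i): where the factor starts
            positions.add(start + length)  # r(i): where the factor ends
            start += length
    return sorted(positions)


# Left side  a X X a      -> lengths 1, 8, 8, 1
# Right side Y b Y a b Y  -> lengths 5, 1, 5, 1, 1, 5
cuts = cut_positions([1, 8, 8, 1], [5, 1, 5, 1, 1, 5])
print(cuts)  # prints "[0, 1, 5, 6, 9, 11, 12, 13, 17, 18]"
```

Here $d = 10$ and there are exactly 10 cuts, which meets the general bound of at most $d$ cuts.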

The next lemma says that subintervals of free intervals are free again. 

Lemma 28 Let $[\alpha, \beta]$ be a free interval and $\mu, \nu$ such that $\min\{\alpha, \beta\} \le
\mu, \nu \le \max\{\alpha, \beta\}$. Then the interval $[\mu, \nu]$ is also free.

Proof. We may assume that $\alpha \le \mu < \nu \le \beta$. By contradiction assume that
$[\mu, \nu]$ is not free. Then there is some $k \ge 0$ and some cut $\gamma'$ such that

$[\mu, \nu] = [\mu_0, \nu_0] \sim [\mu_1, \nu_1] \sim \cdots \sim [\mu_k, \nu_k]$

with $\min\{\mu_k, \nu_k\} < \gamma' < \max\{\mu_k, \nu_k\}$. If $k = 0$, then we have an immediate
contradiction. For $k \ge 1$ the relation $[\mu, \nu] \sim [\mu_1, \nu_1]$ is due to some pair
$(i, j)$ with $l(i) \le \mu < \nu \le r(i)$. Since $[\alpha, \beta]$ contains no cut, we can use
the same pair $(i, j)$ to find an interval $[\alpha_1, \beta_1]$ such that $[\alpha, \beta] \sim [\alpha_1, \beta_1]$ and
$\mu_1, \nu_1 \in \{\min\{\alpha_1, \beta_1\}, \ldots, \max\{\alpha_1, \beta_1\}\}$. Using induction on $k$ we see that
$[\alpha_1, \beta_1]$ cannot be free. A contradiction, because then $[\alpha, \beta]$ is not free. □
Next we introduce the notion of implicit cut for non-free intervals. For our
purpose it is enough to define it for positive intervals. So, let $0 \le \alpha < \beta \le m_0$
such that $[\alpha, \beta]$ is not free. A position $\gamma$ with $\alpha < \gamma < \beta$ is called an implicit
cut of $[\alpha, \beta]$, if we meet the following situation. There is a cut $\gamma'$ and an
interval $[\alpha', \beta']$ such that

$\min\{\alpha', \beta'\} < \gamma' < \max\{\alpha', \beta'\}$,
$[\alpha, \beta] \approx [\alpha', \beta']$,
$\gamma - \alpha = |\gamma' - \alpha'|$.

The following observation will be used throughout. If we have $\alpha \le \mu < \gamma <
\nu \le \beta$ and $\gamma$ is an implicit cut of $[\alpha, \beta]$, then $\gamma$ is also an implicit cut of $[\mu, \nu]$.
In particular, neither $[\mu, \nu]$ nor $[\nu, \mu]$ is a free interval.*

* However, if $\gamma$ is an implicit cut of $[\mu, \nu]$, then it might happen that $\gamma$ is no implicit
cut of $[\alpha, \beta]$, although $[\alpha, \beta]$ is certainly not free.
Lemma 29 Let $0 \le \alpha \le \alpha' < \beta \le \beta' \le m_0$ such that $[\alpha, \beta]$ and $[\alpha', \beta']$ are
free intervals. Then the interval $[\alpha, \beta']$ is free, too.

Proof. Assume by contradiction that $[\alpha, \beta']$ is not free. Then it contains an
implicit cut $\gamma$ with $\alpha < \gamma < \beta'$. By the observation above: If $\gamma < \beta$, then $\gamma$
is an implicit cut of $[\alpha, \beta]$ and $[\alpha, \beta]$ is not free. Otherwise, $\alpha' < \gamma$ and $[\alpha', \beta']$
is not free. □
We now consider the maximal elements. A free interval $[\alpha, \beta]$ is called
maximal free, if there is no free interval $[\alpha', \beta']$ such that both $\alpha' \le \min\{\alpha, \beta\} \le
\max\{\alpha, \beta\} \le \beta'$ and $\beta' - \alpha' > |\beta - \alpha|$. With this notion Lemma 29 states
that maximal free intervals do not overlap.

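Lemma 30 Let $[\alpha, \beta]$ be a maximal free interval. Then there are intervals
$[\gamma, \delta]$ and $[\gamma', \delta']$ such that $[\alpha, \beta] \approx [\gamma, \delta] \approx [\gamma', \delta']$ and $\gamma$ and $\delta'$ are cuts.

Proof. We may assume that $[\alpha, \beta]$ is a positive interval, i.e., $\alpha < \beta$. We show
the existence of $[\gamma, \delta]$ where $[\alpha, \beta] \approx [\gamma, \delta]$ and $\gamma$ is a cut. The existence of
$[\gamma', \delta']$ where $[\alpha, \beta] \approx [\gamma', \delta']$ and $\delta'$ is a cut follows by a symmetric argument.
If $\alpha = 0$, then $\alpha$ itself is a cut and we can choose $\delta = \beta$. Hence let $1 \le \alpha$
and consider the positive interval $[\alpha - 1, \beta]$. This interval is not free, but the
only possible position for an implicit cut is $\alpha$. Thus for some cut $\gamma$ we have
$[\alpha - 1, \beta] \approx [\alpha', \beta']$ with $\min\{\alpha', \beta'\} < \gamma < \max\{\alpha', \beta'\}$ and $|\gamma - \alpha'| = 1$. A
simple reflection shows that we have $[\alpha - 1, \alpha] \approx [\alpha', \gamma]$ and $[\alpha, \beta] \approx [\gamma, \beta']$.
Hence we can choose $\delta = \beta'$. □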

Proposition 31 Let $\Gamma$ be the set of words $w \in \Gamma_0^*$ such that there is a
maximal free interval $[\alpha, \beta]$ with $w = w_0[\alpha, \beta]$. Then $\Gamma$ is a subset of $\Gamma_0^+$ of size
at most $2d - 2$. The set $\Gamma$ is closed under involution.

Proof. Let $[\alpha, \beta]$ be maximal free. Then $|\beta - \alpha| \ge 1$ and $[\beta, \alpha]$ is maximal
free, too. Hence $\Gamma \subseteq \Gamma_0^+$ and $\Gamma$ is closed under involution. By Lemma 30 we
may assume that $\alpha$ is a cut. Say $\alpha < \beta$. Then $\alpha \neq m_0$ and there is no other
maximal free interval $[\alpha, \beta']$ with $\alpha < \beta'$ because of Lemma 29. Hence there
are at most $d - 1$ such intervals $[\alpha, \beta]$. Symmetrically, there are at most $d - 1$
maximal free intervals $[\alpha, \beta]$ where $\beta < \alpha$ and $\alpha$ is a cut. □
For a moment let $\Gamma'_0 = \Gamma_0 \cup \Gamma$ where $\Gamma \subseteq \Gamma_0^+$ is the set defined in
Proposition 31. The inclusion $\Gamma_0 \subseteq \Gamma'_0$ defines a natural projection $\pi : \Gamma_0'^* \to \Gamma_0^*$ and
a mapping $h'_0 : \Gamma'_0 \to M$ by $h'_0 = h_0\pi$. Consider the equation with constraints
$\pi^*(E_0)$; this is a node in the search graph, because the size of $\Gamma$ is linear in $d$.
The reason to switch from $\Gamma_0$ to $\Gamma'_0$ is that, due to the constraints, the word
$w_0$ may have long free intervals, even in a minimal solution. Over $\Gamma'_0$ long
free intervals can be avoided. Formally, we replace $w_0$ by a solution $w'_0$ where
$w'_0 \in \Gamma^*$. The definition of $w'_0$ is based on a factorization of $w_0$ in maximal
free intervals. There is a unique sequence $0 = \alpha_0 < \alpha_1 < \cdots < \alpha_k = m_0$ such
that $[\alpha_{i-1}, \alpha_i]$ is a maximal free interval for each $1 \le i \le k$ and

$w_0 = w_0[\alpha_0, \alpha_1] \cdots w_0[\alpha_{k-1}, \alpha_k]$.

Note that all cuts occur as some $\alpha_p$, therefore we can think of the factors
$w_0[\alpha_{i-1}, \alpha_i]$ as letters in $\Gamma$ for $1 \le i \le k$. Moreover, all constants which
appear in $L_0R_0$ are elements of $\Gamma$. We replace $w_0$ by the word $w'_0 \in \Gamma^*$.
Then we can define $\sigma' : \Omega \to \Gamma^*$ such that both $\sigma'(L_0) = \sigma'(R_0) = w'_0$ and
$\rho_0 = h'_0\sigma'$. In other terms, $\sigma'$ is a solution of $\pi^*(E_0)$. We have $w_0 = \pi(w'_0)$ and
$\exp(w'_0) \le \exp(w_0)$. The crucial point is that $w'_0$ has no long free intervals
anymore. With respect to $w'_0$ and $\Gamma'_0$ all maximal free intervals have length
exactly one.

In the next step we show that we can reduce the alphabet of constants to
be $\Gamma$. The inclusion of $\Gamma$ in $\Gamma'_0$ defines an admissible base change $\beta : \Gamma \to
\Gamma'_0$. Consider $E'_0 = (\Gamma, h, \Omega_0, \rho_0; L_0 = R_0)$ where $h$ is the restriction of the
mapping $h'_0$. Then we have $\pi^*(E_0) = \beta_*(E'_0)$. The search graph contains an
arc from $E_0$ to $E'_0$, since we may choose $\delta$ to be the identity. The equation
with constraints $E'_0$ has a solution $\sigma'$ with $\sigma'(L_0) = w'_0$ and $\exp(w'_0) \le \exp(w_0)$.
In order to avoid too many notations we identify $E_0$ and $E'_0$, hence we also
assume $w_0 = w'_0$. However, as a reminder that we have changed the alphabet
of constants (recall that some words became letters), we prefer to use the
notation $\Gamma$ rather than $\Gamma_0$. Thus, in what follows we shall make the following
assumptions:

$E_0 = (\Gamma, h, \Omega_0, \rho_0; L_0 = R_0)$,
$L_0 = x_1 \cdots x_g$ and $g \ge 2$,
$R_0 = x_{g+1} \cdots x_d$ and $d > g$,
$|\Gamma| \le 2d - 2$,
$|\Omega_0| \le 2d$,
$M \subseteq \mathbb{B}^{n \times n}$.

Moreover: All variables $X$ occur in $L_0R_0\overline{L_0}\,\overline{R_0}$. There is a solution $\sigma$ such
that $w_0 = \sigma(L_0) = \sigma(R_0)$ with $\sigma(x_i) \neq 1$ for $1 \le i \le d$ and $\rho_0 = h\sigma = h_0\sigma$.
We have $|w_0| = m_0$ and $\exp(w_0) \in 2^{O(d + n \log n)}$. All maximal free intervals
have length exactly one, i.e., every positive interval $[\alpha, \beta]$ with $\beta - \alpha > 1$
contains an implicit cut.

It is because of the last sentence that we have worked out the details about 
free intervals. This difficulty is due to the constraints. Without them the 
reasoning would have been much simpler. But the good news is that from
now on, the presence of constraints will not interfere very much.

Example 32 Following Example 27 we use the same equation aXXa =
YbYabY and we consider the solution $w_0$.

The new solution is defined by replacing in $w_0$ each factor bc by a new letter
d which represents a maximal free interval. The new $w_0$ has the form

      0   1    3   4    6   7   8   9    11  12
w_0 = | a | dd | b | ad | d | a | b | dd | a |

Now all maximal free intervals have length one.

13 Critical Words and Blocks 

In the following $\ell$ denotes an integer which varies between 1 and $m_0$. For
each $\ell$ we define the set of critical words $C_\ell$ by

$C_\ell = \{\, w_0[\gamma - \ell, \gamma + \ell],\ w_0[\gamma + \ell, \gamma - \ell] \mid \gamma \text{ is a cut and } \ell \le \gamma \le m_0 - \ell \,\}$.

We have $1 \le |C_\ell| \le 2d - 4$ and $C_\ell$ is closed under involution. Each word $u \in
C_\ell$ has length $2\ell$, it can be written in the form $u = u_1u_2$ with $|u_1| = |u_2| = \ell$.
Then $u_1$ (resp. $\overline{u_2}$) appears as a suffix, left of some cut and $u_2$ (resp. $\overline{u_1}$)
appears as a prefix, right of the same cut.
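
The definition of $C_\ell$ can be sketched directly. The assumptions of this illustration are: a barred letter is modelled as the uppercase version of the letter, the word and the cut positions are the ones printed in Example 27 with every letter treated as unbarred (a simplification made only for this sketch), and the names `factor` and `critical_words` are hypothetical.

```python
def bar(word: str) -> str:
    """Involution on words: reverse the word and bar every letter (swap case)."""
    return word[::-1].swapcase()


def factor(w: str, alpha: int, beta: int) -> str:
    """w[alpha, beta]; for alpha > beta this is the involution of w[beta, alpha]."""
    return w[alpha:beta] if alpha <= beta else bar(w[beta:alpha])


def critical_words(w0: str, cuts, ell: int):
    """All words of length 2*ell centred at a cut gamma with ell <= gamma <= m0 - ell."""
    m0 = len(w0)
    result = set()
    for gamma in cuts:
        if ell <= gamma <= m0 - ell:
            result.add(factor(w0, gamma - ell, gamma + ell))
            result.add(factor(w0, gamma + ell, gamma - ell))
    return result


# Word and cuts as printed in Example 27 (letters treated as unbarred here).
w0 = "".join(["a", "bccb", "b", "abc", "cb", "a", "b", "bccb", "a"])
cuts = [0, 1, 5, 6, 9, 11, 12, 13, 17, 18]

c1 = critical_words(w0, cuts, 1)
print(len(w0), all(len(u) == 2 for u in c1))  # prints "18 True"
```

By construction the returned set is closed under `bar`, and with $d = 10$ it has at most $2d - 4 = 16$ elements, since the cuts $0$ and $m_0$ never qualify.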

A triple $(u, w, v) \in (\{1\} \cup \Gamma^\ell) \times \Gamma^+ \times (\{1\} \cup \Gamma^\ell)$ is called a block if first,
up to a possible prefix or suffix no other factor of the word $uwv$ is a critical
word, second, $u \neq 1$ if and only if a prefix of $uwv$ of length $2\ell$ belongs to
$C_\ell$, and third, $v \neq 1$ if and only if a suffix of $uwv$ of length $2\ell$ belongs
to $C_\ell$. The set of blocks is denoted by $B_\ell$. It is viewed as a (possibly
infinite) alphabet where the involution is defined by $\overline{(u, w, v)} = (\bar{v}, \bar{w}, \bar{u})$.
We can define a homomorphism $\pi_\ell : B_\ell^* \to \Gamma^*$ by $\pi_\ell(u, w, v) = w \in \Gamma^+$
being extended to a projection $\pi_\ell : (B_\ell \cup \Gamma)^* \to \Gamma^*$ by leaving $\Gamma$ invariant.
We define $h_\ell : (B_\ell \cup \Gamma) \to M$ by $h_\ell = h\pi_\ell$. In the following we shall
consider finite subsets $\Gamma_\ell \subseteq B_\ell \cup \Gamma$ which are closed under involution. Then
by $\pi_\ell : \Gamma_\ell^* \to \Gamma^*$ and $h_\ell : \Gamma_\ell \to M$ we understand the restrictions of the
respective homomorphisms.

For every non-empty word $w \in \Gamma^+$ we define its $\ell$-factorization as follows.
We write

$F_\ell(w) = (u_1, w_1, v_1) \cdots (u_k, w_k, v_k) \in B_\ell^+$

such that $w = w_1 \cdots w_k$ and for $1 \le i \le k$ the following conditions are
satisfied:

• $v_i$ is a prefix of $w_{i+1} \cdots w_k$,

• $v_i = 1$ if and only if $i = k$,

• $u_i$ is a suffix of $w_1 \cdots w_{i-1}$,

• $u_i = 1$ if and only if $i = 1$.

Note that the $\ell$-factorization of a word $w$ is unique. For $k \ge 2$ we have $|w_1| \ge
\ell$ and $|w_k| \ge \ell$, but all other $w_i$ may be short. If no critical word appears as a
factor of $w$, then $F_\ell(w) = (1, w, 1)$. In particular, this is the case for $|w| < 2\ell$.
If we have $w = puvq$ with $|u| = |v| = \ell$ and $uv \in C_\ell$, then there is a unique
$i \in \{1, \ldots, k - 1\}$ such that $u = u_{i+1}$, $v = v_i$, and $pu = w_1 \cdots w_i$, $vq =
w_{i+1} \cdots w_k$. Thus, $F_\ell(w)$ contains a factor $(u_i, w_i, v)(u, w_{i+1}, v_{i+1})$ where $v$ is
a prefix of $w_{i+1}v_{i+1}$ and $u$ is a suffix of $u_iw_i$. For example, the $\ell$-factorization
of $uv \in C_\ell$ with $|u| = |v| = \ell$ is

$F_\ell(uv) = (1, u, v)(u, v, 1)$.

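We define the head, body, and tail of a word $w$ based on its $\ell$-factorization

$F_\ell(w) = (u_1, w_1, v_1) \cdots (u_k, w_k, v_k)$

in $B_\ell^*$ and $\Gamma^*$ as follows:

$\mathrm{Head}_\ell(w) = (u_1, w_1, v_1) \in B_\ell$,
$\mathrm{head}_\ell(w) = w_1 \in \Gamma^+$,
$\mathrm{Body}_\ell(w) = (u_2, w_2, v_2) \cdots (u_{k-1}, w_{k-1}, v_{k-1}) \in B_\ell^*$,
$\mathrm{body}_\ell(w) = w_2 \cdots w_{k-1} \in \Gamma^*$,
$\mathrm{Tail}_\ell(w) = (u_k, w_k, v_k) \in B_\ell$,
$\mathrm{tail}_\ell(w) = w_k \in \Gamma^+$.

For $k \ge 2$ (in particular, if $\mathrm{body}_\ell(w) \neq 1$) we have

$F_\ell(w) = \mathrm{Head}_\ell(w)\,\mathrm{Body}_\ell(w)\,\mathrm{Tail}_\ell(w)$,
$w = \mathrm{head}_\ell(w)\,\mathrm{body}_\ell(w)\,\mathrm{tail}_\ell(w)$.

Moreover, $u_2$ is a suffix of $w_1$ and $v_{k-1}$ is a prefix of $w_k$.
Assume $\mathrm{body}_\ell(w) \neq 1$ and let $u, v \in \Gamma^*$ be any words. Then we can view
$w$ in the context $uwv$ and $\mathrm{Body}_\ell(w)$ appears as a proper factor in the
$\ell$-factorization of $uwv$. More precisely, let

$F_\ell(uwv) = (u_1, w_1, v_1) \cdots (u_k, w_k, v_k)$.

Then there are unique $1 \le p < q \le k$ such that:

$F_\ell(uwv) = (u_1, w_1, v_1) \cdots (u_p, w_p, v_p)\,\mathrm{Body}_\ell(w)\,(u_q, w_q, v_q) \cdots (u_k, w_k, v_k)$,
$w_1 \cdots w_p = u\,\mathrm{head}_\ell(w)$,
$w_q \cdots w_k = \mathrm{tail}_\ell(w)\,v$.

Finally, we note that the above definitions are compatible with the involution.
We have $F_\ell(\bar{w}) = \overline{F_\ell(w)}$, $\mathrm{Head}_\ell(\bar{w}) = \overline{\mathrm{Tail}_\ell(w)}$, and $\mathrm{Body}_\ell(\bar{w}) = \overline{\mathrm{Body}_\ell(w)}$.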

14 The Transformation 

Our equation with constraints is $E_0 = (\Gamma, h, \Omega_0, \rho_0;\ x_1 \cdots x_g = x_{g+1} \cdots x_d)$.
We start with the $\ell$-factorization of $w_0 = \sigma(x_1 \cdots x_g) = \sigma(x_{g+1} \cdots x_d)$. Let

$$F_\ell(w_0) = (u_1, w_1, v_1) \cdots (u_k, w_k, v_k).$$

A sequence $S = (u_p, w_p, v_p) \cdots (u_q, w_q, v_q)$ with $1 \le p \le q \le k$ is called
an $\ell$-factor. We say that $S$ is a cover of a positive interval $[\alpha, \beta]$, if both
$|w_1 \cdots w_{p-1}| \le \alpha$ and $|w_{q+1} \cdots w_k| \le m_0 - \beta$. Thus, $w_0[\alpha, \beta]$ becomes a factor
of $w_p \cdots w_q$. It is a minimal cover, if neither $(u_{p+1}, w_{p+1}, v_{p+1}) \cdots (u_q, w_q, v_q)$
nor $(u_p, w_p, v_p) \cdots (u_{q-1}, w_{q-1}, v_{q-1})$ is a cover of $[\alpha, \beta]$. The minimal cover
exists and it is unique.
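
The minimal cover can be computed directly from the block lengths. The sketch below is an illustration under the assumption that a factorization is given only through the lengths $|w_1|, \ldots, |w_k|$ and that $0 \le \alpha < \beta \le m_0$; the name `minimal_cover` and the 1-based return value are choices made for this example.

```python
from itertools import accumulate

def minimal_cover(lengths, alpha, beta):
    """Return (p, q), 1-based, such that blocks p..q are the minimal cover of [alpha, beta]."""
    ends = list(accumulate(lengths))       # ends[i]  = |w_1 ... w_{i+1}|
    starts = [0] + ends[:-1]               # starts[i] = |w_1 ... w_i|
    if not (0 <= alpha < beta <= ends[-1]):
        raise ValueError("need 0 <= alpha < beta <= m_0")
    # Largest p whose preceding blocks have total length <= alpha.
    p = max(i for i, s in enumerate(starts) if s <= alpha)
    # Smallest q such that the blocks after q have total length <= m_0 - beta.
    q = min(i for i, e in enumerate(ends) if e >= beta)
    return p + 1, q + 1
```

Dropping block $p$ or block $q$ from the returned range would violate one of the two cover inequalities, which is why the result is minimal.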

We let $\Omega_\ell = \{\, X \in \Omega_0 \mid \mathrm{body}_\ell(\sigma(X)) \ne 1 \,\}$, and we are going to define a new
left-hand side $L_\ell \in (B_\ell \cup \Omega_\ell)^*$ and a new right-hand side $R_\ell \in (B_\ell \cup \Omega_\ell)^*$.
For $L_\ell$ we consider those $1 \le i \le g$ where $\mathrm{body}_\ell(\sigma(x_i)) \ne 1$. Note that this
implies $x_i \in \Omega_\ell$ since $\ell \ge 1$ and then the body of a constant is always empty.
Recall the definition of $\mathrm{l}(i)$ and $\mathrm{r}(i)$, and define $\alpha = \mathrm{l}(i) + |\mathrm{head}_\ell(\sigma(x_i))|$
and $\beta = \mathrm{r}(i) - |\mathrm{tail}_\ell(\sigma(x_i))|$. Then we have $w_0[\alpha, \beta] = \mathrm{body}_\ell(\sigma(x_i))$. Next
consider the $\ell$-factor $S_i = (u_p, w_p, v_p) \cdots (u_q, w_q, v_q)$ which is the minimal
cover of $[\alpha, \beta]$. Then we have $1 < p \le q < k$ and $w_p \cdots w_q = w_0[\alpha, \beta] =
\mathrm{body}_\ell(\sigma(x_i))$. The definition of $S_i$ depends only on $x_i$, but not on the choice
of the index $i$.

We replace the $\ell$-factor $S_i$ in $F_\ell(w_0)$ by the variable $x_i$. Having done this
for all $1 \le i \le g$ with $\mathrm{body}_\ell(\sigma(x_i)) \ne 1$ we obtain the left-hand side
$L_\ell \in (B_\ell \cup \Omega_\ell)^*$ of the $\ell$-transformation $E_\ell$. For $R_\ell$ we proceed analogously by
replacing those $\ell$-factors $S_i$ where $\mathrm{body}_\ell(\sigma(x_i)) \ne 1$ and $g + 1 \le i \le d$.
For $E_\ell$ we cannot use the alphabet $B_\ell$, because it might be too large or even
infinite. Therefore we let $\Gamma'_\ell$ be the smallest subset of $B_\ell$ which is closed
under involution and which satisfies $L_\ell R_\ell \in (\Gamma'_\ell \cup \Omega_\ell)^*$. We let $\Gamma_\ell = \Gamma'_\ell \cup \Gamma$.
The projection $\pi_\ell : \Gamma_\ell^* \to \Gamma^*$ and the mapping $h_\ell : \Gamma_\ell \to M$ are defined by
the restriction of $\pi_\ell : B_\ell \to \Gamma^*$, $\pi_\ell(u, w, v) = w$ and $h_\ell(u, w, v) = h(w) \in M$
and by $\pi_\ell(a) = a$ and $h_\ell(a) = h(a)$ for $a \in \Gamma$.

Finally, we define the mapping $\rho_\ell : \Omega_\ell \to M$ by $\rho_\ell(X) = h(\mathrm{body}_\ell(\sigma(X)))$.
This completes the definition of the $\ell$-transformation:

$$E_\ell = (\Gamma_\ell, h_\ell, \Omega_\ell, \rho_\ell;\ L_\ell = R_\ell).$$

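Remark 33 One can verify that $\sigma_\ell : \Omega_\ell \to \Gamma_\ell^*$, $\sigma_\ell(X) = \varphi_\ell(\mathrm{Body}_\ell(\sigma(X)))$
defines a solution of $E_\ell$, where $\varphi_\ell$ is the identity on $\Gamma_\ell$ and $\pi_\ell$ on $B_\ell \setminus \Gamma'_\ell$.
However, except for the trivial case $\ell = m_0$, we make no explicit use of this fact.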

Example 34 We continue with our example $aXXa = YbYabY$ and the
solution $\sigma$ which has been given by

$$w_0 = |a|dd|b|ad|d|d|b|dd|a|,$$

where the bars show the cuts.

Up to involution, the set $C_1$ is given by $\{ad, bd, ab, dd\}$ and $C_2$ is given by
$\{ddba, dbad, adda, ddab\}$. The 1-factorization of $w_0$ can be obtained letter by
letter. The 2-factorization of $w_0$ is given by the following sequence:

$$(1, add, ba)(dd, b, ad)(db, ad, da)(ad, d, ab)(dd, a, bd)(da, b, dd)(ab, dda, 1).$$

Recall $\sigma(X) = ddbad$ and $\sigma(Y) = add$. Hence their 2-factorizations are
$(1, add, ba)(dd, b, ad)(db, ad, 1)$ and $(1, add, 1)$, respectively.
By renaming letters, the 2-factorization of $w_0$ becomes $abcdeba$ and the
equation $E$ reduces to $E_2 : aXcdeXa = abcdeba$ since the body of $\sigma(Y)$ is empty.
The reader can check that the 3-factorization of $w_0$ after renaming is the very
same word as the 2-factorization, but the 3-factorization of $\sigma(X)$ is now one
letter, $(1, ddbad, 1)$, so $E_3$ becomes a trivial equation. Plandowski's algorithm
will return true at this stage.

Remark 35 i) In the extreme case $\ell = m_0$, the $\ell$-transformation becomes
trivial. Let $a = (1, w_0, 1)$. Then $\overline{a} = (1, \overline{w_0}, 1)$ and $\Gamma_{m_0} = \{a, \overline{a}\} \cup \Gamma$. Moreover,
we have $L_{m_0} = R_{m_0} = a$, and $h_{m_0}(a) = h(w_0) \in M$. Since $\Omega_{m_0} = \emptyset$,
the equation with constraints $E_{m_0}$ trivially has a solution. It is clear that $E_{m_0}$
is a node in the search graph, and if we reach $E_{m_0}$, then the algorithm will
return true.

ii) The other extreme case is $\ell = 1$. The situation again is simple, but
the precise definition is technically more involved. Consider a block $(u, w, v)$
which appears in $F_1(w_0)$. Then $w = w_0[\alpha, \beta]$ for some $\beta - \alpha \ge 1$. We
cannot have $\beta - \alpha \ge 2$, because then $[\alpha, \beta]$ would have an implicit cut $\gamma$, but
$w_0[\gamma - 1, \gamma + 1] \in C_1$ and no critical word is a factor of $w$. An immediate
consequence is $|\Gamma_1| \le (|\Gamma| + 1)^3 \in O(d^3)$. Let $X \in \Omega_0$. Then $\mathrm{Body}_1(\sigma(X)) \ne 1$
if and only if $|\sigma(X)| \ge 3$. Thus, for $X \in \Omega_1$ we have $\sigma(X) = bcu = vde$
with $b, c, d, e \in \Gamma$ and $u, v \in \Gamma^+$. It follows:

$$F_1(\sigma(X)) = (1, b, c)(b, c, v_2) \cdots (u_{|v|+1}, d, e)(d, e, 1).$$

For example, for $|v| = 1$ this means $u_2 = b$, $c = d$, and $v_2 = e$.

We can describe $L_1 \in (\Gamma_1 \cup \Omega_1)^*$ as follows:

For $1 \le i \le g$ let $w_i = \sigma(x_i)$ and $a_i$ the last letter of $\sigma(x_{i-1})$ if $i > 1$ and
$a_1 = 1$. Let $f_i$ be the first letter of $\sigma(x_{i+1})$ if $i < g$ and $f_g = 1$. Let $b_i$ be the first
letter of $w_i$ and $e_i$ the last letter of $w_i$.

For $|w_i| = 1$ we replace $x_i$ by the 1-factor $(a_i, b_i, f_i)$.

For $|w_i| = 2$ we replace $x_i$ by the 1-factor $(a_i, b_i, e_i)(b_i, e_i, f_i)$.

For $|w_i| \ge 3$ we let $c_i$ be the second letter of $w_i$ and $d_i$ its second last. In this
case we replace $x_i$ by $(a_i, b_i, c_i)\,x_i\,(d_i, e_i, f_i)$.

The definition of $R_1$ is analogous. Thus, we obtain $|L_1 R_1| \le 3|L_0 R_0| = 3d$,
and $E_1$ is admissible. We also see that there was an overestimation of the
size of $|\Gamma_1|$. For each $x_i$ we need at most two constants together with their
involutions. Since $\Gamma_1$ contains also $\Gamma$, we obtain $|\Gamma_1| \le 6d$.
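
The case distinction for $L_1$ can be sketched as follows. This is an illustration only: one side of the equation is assumed to be given as a list `symbols` of names, `sigma` is assumed to map every symbol (constants map to themselves) to its non-empty value, triples are Python tuples, and the empty string stands for the empty word 1. The name `left_side_1` is hypothetical.

```python
def left_side_1(symbols, sigma):
    """Build the side L_1 from one side x_1 ... x_g and the solution sigma."""
    words = [sigma[x] for x in symbols]
    out = []
    for i, (x, w) in enumerate(zip(symbols, words)):
        a = words[i - 1][-1] if i > 0 else ""               # last letter of the left neighbour
        f = words[i + 1][0] if i + 1 < len(words) else ""   # first letter of the right neighbour
        b, e = w[0], w[-1]
        if len(w) == 1:
            out.append((a, b, f))
        elif len(w) == 2:
            out += [(a, b, e), (b, e, f)]
        else:
            # The variable survives and stands for the body of its value.
            out += [(a, b, w[1]), x, (w[-2], e, f)]
    return out
```

Every symbol is replaced by at most three symbols, which matches the bound $|L_1 R_1| \le 3d$ stated above.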
By the remark above, $E_1$ and $E_{m_0}$ are admissible and hence nodes of the
search graph. The goal is to reach $E_{m_0}$ via $E_1$ when starting with $E_0$. For
the moment it is not even clear that the $\ell$-transformations $E_\ell$ with $1 < \ell < m_0$
belong to the search graph. We prove this statement in the next section.

15 The $\ell$-transformation $E_\ell$ is admissible

Proposition 36 There is a polynomial $p$ (of degree at most 4) such that
each $E_\ell$ is admissible for all $\ell \ge 1$.

Proof. It is enough to show that $L_\ell$ and $R_\ell$ can be represented by exponential
expressions of size $O(d^2(d + n \log n))$. Then $\Gamma_\ell$ can have size at most
$O(d^2(d + n \log n))$ and the assertion follows. We will estimate the size of an exponential
expression for $L_\ell$ only.
We start again with the $\ell$-factorization of $w_0$:

$$F_\ell(w_0) = (u_1, w_1, v_1) \cdots (u_k, w_k, v_k).$$

If $k$ is small there is nothing to do since $|L_\ell| \le |F_\ell(w_0)|$. An easy reflection
shows that $|L_\ell|$ can become large only if there is some $1 \le i \le g$ such
that $\mathrm{head}_\ell(\sigma(x_i))$ or $\mathrm{tail}_\ell(\sigma(x_i))$ is long. By symmetry we treat the case
$\mathrm{head}_\ell(\sigma(x_i))$ only and we fix some notation. We let $1 \le i \le g$, $\alpha = \mathrm{l}(i)$, and
$\beta = \alpha + |\mathrm{head}_\ell(\sigma(x_i))|$. Let

$$(u_{p-1}, w_{p-1}, v_{p-1}) \cdots (u_{q+1}, w_{q+1}, v_{q+1})$$

be a minimal cover of $[\alpha, \beta]$. We may assume that $q - p$ is large. It is enough
to find an exponential expression for the $\ell$-factor

$$(u_p, w_p, v_p) \cdots (u_q, w_q, v_q)$$

having size in $O(d(d + n \log n))$, because we want the whole expression to
have size in $O(d^2(d + n \log n))$.

Note that $w_p \cdots w_q$ is a proper factor of $\mathrm{head}_\ell(\sigma(x_i))$. Hence no critical word
of $C_\ell$ can appear as a factor inside $w_p \cdots w_q$. This means there is some
$p \le s \le q$ such that both $|w_p \cdots w_{s-1}| < \ell$ and $|w_{s+1} \cdots w_q| < \ell$. Indeed,
if $|w_p \cdots w_{q-1}| < \ell$, then we choose $s = q$. Otherwise we let $p \le s < q$
be minimal such that $|w_p \cdots w_s| \ge \ell$. Then $|w_{s+1} \cdots w_q| \ge \ell$ is impossible
because $u_{s+1}v_s \in C_\ell$ would appear as a factor in $w_p \cdots w_q$. We can write

$$(u_p, w_p, v_p) \cdots (u_q, w_q, v_q) = S_1\,(u_s, w_s, v_s)\,S_2;$$

and since $(u_s, w_s, v_s) \in \Gamma_\ell$ is a letter, it is enough to find exponential
expressions for $S_i$, $i = 1, 2$, of size $O(d(d + n \log n))$ each. As a conclusion it is
enough to prove the following lemma. □
The statement of the next lemma is slightly more general than we need above.
There we need the lemma for $c = 1$, but later we will apply the lemma with
values $c \le 32d$.



Lemma 37 Let $c \ge 1$ be a number and

$$S = (u_1, w_1, v_1) \cdots (u_k, w_k, v_k) \in B_\ell^*$$

be a sequence which appears as some $\ell$-factor in $F_\ell(w_0)$. If we have $k < 3$ or
$|w_2 \cdots w_{k-1}| \le c\ell$, then we can represent the sequence by some exponential
expression of size $O(cd(d + n \log n))$.

Proof. We show that there is an exponential expression of size $O(d(d +
n \log n))$ under the assumption $|w_1 \cdots w_k| < \ell$. This is enough, because we
always can write $S$ as $a_0 S_1 a_1 \cdots S_{c'} a_{c'}$, where $c' \le c$, the $a_i$ are letters, and
each $S_i$ satisfies the assumption. Note that the assumption implies $u_1 \ne
1 \ne v_k$ and we may define $u_{k+1}$ as the suffix of length $\ell$ of $u_1 w_1 \cdots w_k$. For
$1 \le i \le k$ let $z_i = u_{i+1}v_i$. Then $z_i \in C_\ell$ is a critical word which appears
as a factor in $z = u_1 w_1 w_2 \cdots w_k v_k$. If the words $z_i$, $1 \le i \le k$, are pairwise
different, then $k - 1 < |C_\ell| \in O(d)$ and we are done. Hence we may assume
that there are repetitions. Let $j$ be the smallest index such that a critical
word is seen for the second time and let $i < j$ be the first appearance of $z_j$.
This means for $1 \le i < j$ the words $z_1, \ldots, z_{j-1}$ are pairwise different and
$z_i = z_j$. Now, $|w_{i+1} \cdots w_j| < \ell$ and $|z_i| = 2\ell$, hence $z_i$ and $z_j$ overlap in $z$. We
can choose $r$ maximal such that $u_1 w_1 \cdots w_i (w_{i+1} \cdots w_j)^r v_j$ is a prefix of the
word $z$. (Note that the last factor $v_j$ ensures that the prefix ends with $z_j$.)
For some index $s > j$ we can write

$$z = u_1 w_1 \cdots w_i (w_{i+1} \cdots w_j)^r w_s \cdots w_k v_k.$$

We claim that $z_i \notin \{z_s, \ldots, z_k\}$. Indeed, let $t$ be maximal such that $z_i = z_t$
and assume that $j \ne t$. Then both $|w_{i+1} \cdots w_j|$ and $|w_{j+1} \cdots w_t|$ are periods
of $z_i$, but $|w_{i+1} \cdots w_t| < |z_i|$. Hence by Fine and Wilf's Theorem [16] we
obtain that the greatest common divisor of $|w_{i+1} \cdots w_j|$ and $|w_{j+1} \cdots w_t|$ is
a period, too. Due to the definition of an $\ell$-factorization ($z_j$ was the first
repetition) the length $|w_{j+1} \cdots w_t|$ is therefore a multiple of $|w_{i+1} \cdots w_j|$ and
we must have $t = s - 1$. This shows the claim. Moreover, we have

$$(u_1, w_1, v_1) \cdots (u_k, w_k, v_k)
= (u_1, w_1, v_1) \cdots (u_i, w_i, v_i)\,[(u_{i+1}, w_{i+1}, v_{i+1}) \cdots (u_j, w_j, v_j)]^r\,S'$$

where $S' = (u_s, w_s, v_s) \cdots (u_k, w_k, v_k)$ for $s = i + 1 + r(j - i)$. We have
$r \le \exp(w_0)$, hence $r \in 2^{O(d + n \log n)}$. It follows that

$$(u_1, w_1, v_1) \cdots (u_i, w_i, v_i)\,[(u_{i+1}, w_{i+1}, v_{i+1}) \cdots (u_j, w_j, v_j)]^r$$

is an exponential expression of size $j + \log(r) \in O(d + n \log n)$. More precisely,
for some suitable constant $c$ its size is at most $c(d + n \log n)$. The constant
$c$ depends only on the constant which is hidden when writing $\exp(w_0) \in
2^{O(d + n \log n)}$. By induction on the size of the set $\{z_1, \ldots, z_k\}$ we may assume
that $S' = (u_s, w_s, v_s) \cdots (u_k, w_k, v_k)$ has an exponential expression of size at
most $|\{z_s, \ldots, z_k\}|\,c(d + n \log n)$. Hence the exponential expression for $S$ has size
at most

$$c(d + n \log n) + |\{z_s, \ldots, z_k\}|\,c(d + n \log n) \le |\{z_1, \ldots, z_k\}|\,c(d + n \log n).$$

Thus, the size is in $O(d(d + n \log n))$. □
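
The folding step of the proof can be illustrated by a simplified sketch. It is not the construction of the lemma itself: instead of tracking critical words $z_i$, it looks for the first repeated letter in a plain sequence and folds the maximal run of consecutive copies of the period into one power. The name `compress_once` and the tuple it returns are assumptions made for this example.

```python
def compress_once(seq):
    """Fold the first repetition of seq into prefix + period**r + rest.

    Returns (prefix, period, r, rest) with seq == prefix + period * r + rest.
    If no letter repeats, returns (seq, [], 0, []).
    """
    seen = {}
    for j, x in enumerate(seq):
        if x in seen:
            i = seen[x]
            period = seq[i + 1:j + 1]          # letters i+1 .. j
            r, pos = 1, j + 1
            # Count how many further copies of the period follow directly.
            while seq[pos:pos + len(period)] == period:
                r += 1
                pos += len(period)
            return seq[:i + 1], period, r, seq[pos:]
        seen[x] = j
    return seq, [], 0, []
```

Stored with the exponent in binary, the folded part costs roughly `len(prefix) + len(period) + log2(r)` symbols, which mirrors the bound $j + \log(r)$ used in the proof; the remainder `rest` would then be handled recursively.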
At this stage we know that all $\ell$-transformations are admissible (with respect
to some suitable polynomial of degree 4). Thus $E_1, \ldots, E_{m_0}$ are nodes of the
search graph. Next we show that the search graph contains arcs $E_0 \to E_1$
and $E_\ell \to E_{\ell'}$ for $1 \le \ell < \ell' \le 2\ell$. Hence the graph contains a path (of
logarithmic length in $m_0$) from $E_0$ to $E_{m_0}$. The non-deterministic procedure
is able to find this path and on input $E_0$ Plandowski's algorithm gives the
correct answer.
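
To see why the path has logarithmic length, one can list one possible sequence of indices $\ell$ along it. The sketch below uses a doubling strategy, which is one admissible choice under the arc condition $\ell < \ell' \le \max\{1, 2\ell\}$; the function name `transformation_path` is hypothetical.

```python
def transformation_path(m0):
    """Indices 0, 1, 2, 4, ... ending in m0, each step at most doubling."""
    if m0 < 1:
        raise ValueError("m0 must be at least 1")
    path = [0, 1]                       # the arc E_0 -> E_1 is treated separately
    while path[-1] < m0:
        path.append(min(2 * path[-1], m0))
    return path
```

For $m_0 = 12$ this gives the indices 0, 1, 2, 4, 8, 12, so about $\log_2 m_0$ arcs are followed after the initial step.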

In order to establish the existence of arcs from $E_\ell$ to $E_{\ell'}$ for $0 \le \ell < \ell' \le
\max\{1, 2\ell\}$ we shall define intermediate equations $E_{\ell,\ell'}$ such that there is an
admissible base change $\beta$, a projection $\pi$, and a partial solution $\delta$ with

$$\delta_*(\pi^*(E_\ell)) = E_{\ell,\ell'} = \beta_*(E_{\ell'}).$$

16 The arc from $E_0$ to $E_1$

Recall the definition of $E_1 = (\Gamma_1, h_1, \Omega_1, \rho_1;\ L_1 = R_1)$. The letters of $\Gamma_1$ can
be written either as $(a, b, c)$ or as $b$ with $a, c \in \Gamma \cup \{1\}$ and $b \in \Gamma$. We define
a projection which is used here as a base change $\beta : \Gamma_1 \to \Gamma$ by $\beta(a, b, c) = b$
and leaving the letters of $\Gamma$ invariant. Clearly, $h_1 = h\beta$, and $\beta$ defines an
admissible base change. Define $E_{0,1} = \beta_*(E_1)$. Then we have $L_{0,1} = \beta(L_1)$
and $R_{0,1} = \beta(R_1)$, where $\beta : (\Gamma_1 \cup \Omega_1)^* \to (\Gamma \cup \Omega_1)^*$ is the extension with
$\beta(X) = X$ for all $X \in \Omega_1$. We have $\Gamma_{0,1} = \Gamma$.

It is now obvious how to define the partial solution $\delta : \Omega_0 \to \Gamma\Omega_1\Gamma \cup \Gamma^*$
such that $\delta_*(E_0) = E_{0,1}$. If $|\sigma(X)| \le 2$, then we let $\delta(X) = \sigma(X)$. For
$|\sigma(X)| \ge 3$ we write $\sigma(X) = aub$ with $a, b \in \Gamma$ and $u \in \Gamma^+$. Then we have
$X \in \Omega_1 = \Omega_{0,1}$ and we define $\delta(X) = aXb$ and $\rho_{0,1}(X) = h(u)$. For $X \in \Omega_1$
we have $\rho_1(X) = h(\mathrm{body}_1(\sigma(X)))$, hence $\rho_{0,1} = \rho_1$, too. This shows that,
indeed, $\delta_*(E_0) = \beta_*(E_1)$. Formally, we can write this as $\delta_*(\pi^*(E_0)) = \beta_*(E_1)$,
where $\pi$ is the identity. Hence there is an arc from $E_0$ to $E_1$.
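
The identity $\delta(L_0) = \beta(L_1)$ can be checked on a toy instance. The sketch below is an illustration under assumed representations: a side of an equation is a Python list whose entries are single-character constants, variable names, or triples `(a, b, c)`, and `sigma` maps every symbol of the original side to its value. The names `apply_delta` and `apply_beta` are hypothetical.

```python
def apply_delta(symbols, sigma):
    """Partial solution delta applied to one side of E_0.

    A symbol with a value of length at most 2 is replaced by that value;
    otherwise it is kept between the first and last letter of its value.
    """
    out = []
    for x in symbols:
        w = sigma[x]
        out.extend(w if len(w) <= 2 else [w[0], x, w[-1]])
    return out

def apply_beta(side):
    """Base change beta: project each triple (a, b, c) to b, keep everything else."""
    return [s[1] if isinstance(s, tuple) else s for s in side]
```

For the side `["a", "X", "b"]` with `sigma["X"] == "cde"`, both `apply_delta` on the original side and `apply_beta` on the corresponding $L_1$ give `["a", "c", "X", "e", "b"]`.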

17 The equations $E_{\ell,\ell'}$ for $1 \le \ell < \ell' \le 2\ell$

In this section we define for each $1 \le \ell < \ell' \le 2\ell$ an intermediate equation
with constraints

$$\beta_*(E_{\ell'}) = E_{\ell,\ell'} = (\Gamma_{\ell,\ell'}, h_{\ell,\ell'}, \Omega_{\ell'}, \rho_{\ell'};\ L_{\ell,\ell'} = R_{\ell,\ell'})$$

by some base change $\beta : \Gamma_{\ell'} \to (B_\ell \cup \Gamma)^*$, then we show that $\beta$ is admissible.
Recall $\Gamma \subseteq \Gamma_{\ell'} \subseteq B_{\ell'} \cup \Gamma$. The base change $\beta$ leaves the letters of $\Gamma$ invariant.
Consider some $(u, w, v) \in \Gamma_{\ell'} \setminus \Gamma$. It is enough to define $\beta(u, w, v)$ or $\beta(\overline{v}, \overline{w}, \overline{u})$.
Hence we may assume that $(u, w, v)$ appears as a letter in the $\ell'$-factorization
$F_{\ell'}(w_0)$. Therefore we find a positive interval $[\alpha, \beta]$ such that $w = w_0[\alpha, \beta]$
and such that the following two conditions are satisfied:

1) We have $u = 1$ and $\alpha = 0$ or $|u| = \ell'$, $\alpha \ge \ell'$, and $u = w_0[\alpha - \ell', \alpha]$.

2) We have $v = 1$ and $\beta = m_0$ or $|v| = \ell'$, $\beta \le m_0 - \ell'$, and $v = w_0[\beta, \beta + \ell']$.

Let $(u_p, w_p, v_p) \cdots (u_q, w_q, v_q)$ be the $\ell$-factor which is the minimal cover of
$[\alpha, \beta]$ with respect to the $\ell$-factorization $F_\ell(w_0)$. Since $\ell < \ell'$ we have
$w_p \cdots w_q = w$. Moreover, the word $u_p$ is a suffix of $u$ and $v_q$ is a prefix
of $v$. We define

$$\beta(u, w, v) = (u_p, w_p, v_p) \cdots (u_q, w_q, v_q) \in B_\ell^+.$$

The definition does not depend on the choice of $[\alpha, \beta]$ as long as $0 \le \alpha < \beta \le
m_0$ and 1) and 2) are satisfied. We have $\overline{\beta(u, w, v)} = \beta(\overline{v}, \overline{w}, \overline{u})$ and $h_\ell\beta = h_{\ell'}$.
Now let $\Gamma_{\ell,\ell'} \subseteq B_\ell \cup \Gamma$ be the smallest subset such that $\beta(\Gamma_{\ell'}) \subseteq \Gamma_{\ell,\ell'}^*$. Then
$\Gamma_{\ell,\ell'}$ contains $\Gamma$ and it is closed under involution (since $\Gamma_{\ell'}$ has this property).
A crucial, but easy reflection shows that $\Gamma_\ell \subseteq \Gamma_{\ell,\ell'}$. This will become essential
later.

We view $\beta$ as a homomorphism $\beta : \Gamma_{\ell'}^* \to \Gamma_{\ell,\ell'}^*$ and define $E_{\ell,\ell'} = \beta_*(E_{\ell'})$.
Let us show that $\beta$ defines an admissible base change. Since $E_{\ell'}$ is already
known to be admissible with respect to some polynomial of degree 4, it is
enough to find some admissible exponential expression (again with respect
to some polynomial of degree 4) for the $\ell$-factor

$$\beta(u, w, v) = (u_p, w_p, v_p) \cdots (u_q, w_q, v_q)$$

where $(u, w, v) \in \Gamma_{\ell'} \setminus \Gamma$. We use the same notations as above. Thus, for
some positive interval $[\alpha, \beta]$ we have $w_p \cdots w_q = w_0[\alpha, \beta]$, the word $u$ is a
suffix of $w_0[0, \alpha]$, and $v$ is a prefix of $w_0[\beta, m_0]$. If $q - p$ is small, there is
nothing to do. By Lemma 37 we may also assume that $\beta - \alpha > 32d\ell$. We
are to define inductively a sequence of positions

$$\alpha = \alpha_0 \le \alpha_1 \le \cdots \le \alpha_i \le \cdots < \beta_i \le \cdots \le \beta_1 \le \beta_0 = \beta.$$

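Each time we let $W_i = w_0[\alpha_i, \beta_i]$. Thus, $W_0 = w_p \cdots w_q$. Assume that
$W_i = w_0[\alpha_i, \beta_i]$ is already defined such that $\beta_i - \alpha_i \ge 2$. The interval $[\alpha_i, \beta_i]$
is not free. Hence, there is some implicit cut $\gamma_i$ with $\alpha_i < \gamma_i < \beta_i$. The
word $W_i$ is a factor of $w$, hence no factor of $W_i$ belongs to the set of critical
words $C_{\ell'}$. This implies $\beta_i - \gamma_i < \ell'$ or $\gamma_i - \alpha_i < \ell'$. If we have $\beta_i - \gamma_i < \ell'$
then we let $\alpha_{i+1} = \alpha_i$ and $\beta_{i+1} = \gamma_i$. In the other case we let $\alpha_{i+1} = \gamma_i$ and
$\beta_{i+1} = \beta_i$. Thus $W_{i+1}$ is defined such that $W_{i+1}$ is a proper factor of $W_i$ with
$|W_i| - |W_{i+1}| < \ell'$.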

We need some additional bookkeeping. We define $r_i \in \{\mathrm{l}, \mathrm{r}\}$ by $r_i = \mathrm{r}$ if
$\beta_i = \beta_{i+1}$ and $r_i = \mathrm{l}$ otherwise (i.e., $\alpha_i = \alpha_{i+1}$). Furthermore the implicit cut
$\gamma_i$ corresponds to some real cut $\gamma'_i$ and $\alpha'_i < \gamma'_i < \beta'_i$ such that $W_i = w_0[\alpha'_i, \beta'_i]$
or $W_i = w_0[\beta'_i, \alpha'_i]$. We define $s_i \in \{+, -\}$ by $s_i = +$ if $W_i = w_0[\alpha'_i, \beta'_i]$ and
$s_i = -$ otherwise (in particular, $s_i = -$ implies $\overline{W_i} = w_0[\alpha'_i, \beta'_i]$). The triple
$(\gamma'_i, r_i, s_i)$ is denoted by $\gamma(i)$. There are at most $4(d - 2)$ such triples and
$\gamma(i)$ is defined whenever $W_{i+1}$ is defined. We stop the induction procedure
after the first repetition. Thus we find $0 \le i < j < 4d$ such that $\gamma(i) = \gamma(j)$.
We obtain a sequence $W_0, W_1, \ldots, W_i, \ldots, W_j$ where each word is a proper
factor of the preceding one. We have $|W_0| - |W_j| < 4d\ell' \le 8d\ell$ and due to
$|W_0| > 32d\ell$ the sequence above really exists, moreover $|W_j| > 8d\ell$.
Next, we show that $W_j$ has a non-trivial overlap with itself. We treat the
case $\gamma(i) = \gamma(j) = (\gamma, \mathrm{r}, +)$ only. The other three cases $(\gamma, \mathrm{r}, -)$, $(\gamma, \mathrm{l}, +)$,
and $(\gamma, \mathrm{l}, -)$ can be treated analogously. For some $\alpha' < \gamma < \beta'$ we have
$W_i = w_0[\alpha', \beta']$ and $W_{i+1} = w_0[\gamma, \beta']$. Thus, for some $\gamma \le \mu < \nu \le \beta'$ we have
$W_j = w_0[\mu, \nu]$ and we can assume that $\mu - \gamma < (j - i - 1)\ell' \le 4d\ell' - \ell' \le 8d\ell - \ell'$.
On the other hand we have $\gamma(j) = (\gamma, \mathrm{r}, +)$, too. Hence for some $\mu' < \gamma < \nu'$
with $\gamma - \mu' < \ell'$ we have $W_j = w_0[\mu', \nu']$, too. Therefore $0 < \mu - \mu' < 8d\ell$
and $W_j$ has some non-trivial overlap. We can write $W_j = W^e W'$ such that
$1 \le |W| < 8d\ell$ and $W'$ is a prefix of $W$.
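
The last step, passing from a non-trivial overlap to a factorization $W^e W'$, is a standard fact about periodic words and can be sketched as follows. The function name `periodic_decomposition` and the parameter `shift` (the difference $\mu - \mu'$ of the two occurrences) are assumptions made for this illustration.

```python
def periodic_decomposition(word, shift):
    """Write word as W**e + W_prime, given that word overlaps itself at distance shift.

    The overlap condition is word[shift:] == word[:-shift]; then W = word[:shift]
    is a period, and W_prime is a proper prefix of W.
    """
    if not (0 < shift < len(word)) or word[shift:] != word[:-shift]:
        raise ValueError("word does not overlap itself at this shift")
    W = word[:shift]
    e, rest = divmod(len(word), shift)
    W_prime = word[e * shift:]          # has length rest < shift
    return W, e, W_prime
```

For example, `"abcabcab"` overlaps itself at distance 3 and decomposes as `"abc"` squared followed by the prefix `"ab"`.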

Putting everything together, we arrive in all cases at a factorization $W_0 =
U W^e V$ with $e \le \exp(w_0)$, $1 \le |W| < 8d\ell$, and $|U| + |V| < 16d\ell$. However, we
have not finished yet. Recall that we are looking for an admissible exponential
expression for

$$\beta(u, w, v) = (u_p, w_p, v_p) \cdots (u_q, w_q, v_q).$$

Due to $|W_0| > \ell$ we can choose $r$ minimal, $p \le r \le q + 1$, and $s$ maximal,
$p - 1 \le s \le q$, such that $|w_p \cdots w_{r-1}| \ge |U| + \ell$ and $|w_{s+1} \cdots w_q| \ge |V| + \ell$.
By Lemma 37 we may assume $r \le s$ and it is enough to find an exponential
expression for

$$S = (u_r, w_r, v_r) \cdots (u_s, w_s, v_s).$$

Note that the word $u_r w_r w_{r+1} \cdots w_s v_s$ is a factor of $W^e$. Again, we may
assume that $|w_r w_{r+1} \cdots w_s| > 32d\ell$. By switching to some conjugated word
$W'$ if necessary, we may assume that $u_r w_r w_{r+1} \cdots w_s v_s$ is a prefix of $W^e$.
Moreover, by symmetry we may choose a positive interval $[\alpha, \beta]$ such that
$w_0[\alpha, \beta] = u_r w_r w_{r+1} \cdots w_s v_s$. Clearly, we have $w_0[i, j] = w_0[i + |W|, j + |W|]$
for all $\alpha \le i < j \le \beta - |W|$. In particular, the critical word $w_0[\alpha, \alpha + 2\ell]$
appears as $w_0[\alpha + |W|, \alpha + |W| + 2\ell]$ again. This means that there is some
$r \le t < s$ such that $|w_r \cdots w_t| = |W|$. More precisely, we can choose $r \le t <
t' \le s$ and a maximal $e' \le e$ such that

$$S = \bigl((u_r, w_r, v_r) \cdots (u_t, w_t, v_t)\bigr)^{e'} (u_{t'}, w_{t'}, v_{t'}) \cdots (u_s, w_s, v_s).$$

Since it holds $e' \le \exp(w_0)$, $|w_r \cdots w_t| = |W|$, and $|w_{t'} \cdots w_s| < |W|$, the
existence of an admissible exponential expression for $\beta(u, w, v)$ follows. Hence
$\beta$ is an admissible base change.

18 Passing from $E_\ell$ to $E_{\ell,\ell'}$ for $1 \le \ell < \ell' \le 2\ell$

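In the final step we have to show that there exists some projection
$\pi : \Gamma_{\ell,\ell'}^* \to \Gamma_\ell^*$ and some partial solution $\delta : \Omega_\ell \to \Gamma_{\ell,\ell'}^*\Omega_{\ell'}\Gamma_{\ell,\ell'}^* \cup \Gamma_{\ell,\ell'}^*$ such that
$\delta_*(\pi^*(E_\ell)) = E_{\ell,\ell'}$. We don't have to care about admissibility anymore.

For the projection we have to consider a letter in $\Gamma_{\ell,\ell'} \setminus \Gamma_\ell$. Such a letter has
the form $(u, w, v) \in B_\ell$ and we may define $\pi(u, w, v) = w$ since $\Gamma \subseteq \Gamma_\ell$.
Clearly $\overline{\pi(u, w, v)} = \pi(\overline{v}, \overline{w}, \overline{u})$ and $h_{\ell,\ell'}(u, w, v) = h_\ell(u, w, v) = h(w) =
h_\ell(\pi(u, w, v))$ are verified. Thus $\pi : \Gamma_{\ell,\ell'}^* \to \Gamma_\ell^*$ defines a projection such that

$$\pi^*(E_\ell) = (\Gamma_{\ell,\ell'}, h_{\ell,\ell'}, \Omega_\ell, \rho_\ell;\ L_\ell = R_\ell).$$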

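We have to define a partial solution $\delta : \Omega_\ell \to \Gamma_{\ell,\ell'}^*\Omega_{\ell'}\Gamma_{\ell,\ell'}^* \cup \Gamma_{\ell,\ell'}^*$ such that
$\delta(L_\ell) = \beta(L_{\ell'})$ and $\delta(R_\ell) = \beta(R_{\ell'})$. For this, we have to consider a variable
$X \in \Omega_\ell$ with $\mathrm{body}_\ell(\sigma(X)) \ne 1$. By symmetry, we may assume that $X = x_i$
for some $1 \le i \le g$. Hence $\sigma(X) = w_0[\mathrm{l}(i), \mathrm{r}(i)]$.

Let $\alpha = \mathrm{l}(i) + |\mathrm{head}_\ell(\sigma(X))|$ and $\beta = \mathrm{r}(i) - |\mathrm{tail}_\ell(\sigma(X))|$. Then $\mathrm{l}(i) + \ell \le
\alpha < \beta \le \mathrm{r}(i) - \ell$. Let $(u_p, w_p, v_p) \cdots (u_q, w_q, v_q)$ be the minimal cover of $[\alpha, \beta]$
with respect to the $\ell$-factorization. We have $w_p \cdots w_q = \mathrm{body}_\ell(\sigma(X))$.
For $\mathrm{body}_{\ell'}(\sigma(X)) = 1$ we have $X \in \Omega_\ell \setminus \Omega_{\ell'}$ and we define

$$\delta(X) = (u_p, w_p, v_p) \cdots (u_q, w_q, v_q).$$

Then $\delta(X) \in B_\ell^+$ and $h_\ell\delta(X) = \rho_\ell(X)$ since $\rho_\ell(X) = h(\mathrm{body}_\ell(\sigma(X)))$. It is
also clear that the definition does not depend on the choice of $i$, and we have
$\overline{\delta(X)} = \delta(\overline{X})$.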

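Recall the definition of $L_{\ell'}$. Since $\mathrm{body}_{\ell'}(\sigma(X)) = 1$, there is a factor $f_1 \cdots f_r$
of $L_{\ell'}$ which belongs to $\Gamma_{\ell'}^*$ and $f_1 \cdots f_r$ covers $[\alpha, \beta]$ with respect to the
$\ell'$-factorization $F_{\ell'}(w_0)$. It follows that $\delta(X)$ is a factor of $\beta(f_1 \cdots f_r)$, hence
$\delta(X) \in \Gamma_{\ell,\ell'}^*$ by definition of $\Gamma_{\ell,\ell'}$.

For $\mathrm{body}_{\ell'}(\sigma(X)) \ne 1$ we have $X \in \Omega_{\ell'}$ and we find positions $\mu < \nu$ such that
$\mu = \mathrm{l}(i) + |\mathrm{head}_{\ell'}(\sigma(X))|$ and $\nu = \mathrm{r}(i) - |\mathrm{tail}_{\ell'}(\sigma(X))|$.

For some $p \le r \le s \le q$ we have $w_0[\alpha, \mu] = w_p \cdots w_{r-1}$, $w_0[\nu, \beta] =
w_{s+1} \cdots w_q$, and $\mathrm{body}_{\ell'}(\sigma(X)) = w_r \cdots w_s$. We define

$$\delta(X) = (u_p, w_p, v_p) \cdots (u_{r-1}, w_{r-1}, v_{r-1})\,X\,(u_{s+1}, w_{s+1}, v_{s+1}) \cdots (u_q, w_q, v_q).$$

As above, we can verify that $\delta(X) = UXV$ with $U, V \in \Gamma_{\ell,\ell'}^*$ such that
$\delta(\overline{X}) = \overline{V}\,\overline{X}\,\overline{U}$ and $\rho_\ell(X) = h_{\ell,\ell'}(U)\rho_{\ell'}(X)h_{\ell,\ell'}(V)$. Finally, $\delta(L_\ell) = L_{\ell,\ell'}$ and
$\delta(R_\ell) = R_{\ell,\ell'}$. Hence $\delta_*(\pi^*(E_\ell)) = \beta_*(E_{\ell'})$. This proves the theorem.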

19 Conclusion 

In this paper we were dealing with the existential theory only. For free groups
it is also known that the positive theory without constraints is decidable.
Thus, one can also allow a mixture of existential and universal
quantifiers, if there are no negations at all. Since a negation can be replaced
with the help of an extra variable and some positive rational constraint,
one might be tempted to prove that the positive theory of equations with
rational constraints in free groups is decidable. But such a program must
fail: Indeed, by [^| and [0] it is known that the positive $\forall\exists^3$-theory of word
equations is unsolvable. Since $\Sigma^*$ is a rational subset of the free group $F(\Sigma)$,
this theory can be encoded in the positive theory of equations with rational
constraints in free groups, and the latter is undecidable, too. On the other
hand, a negation leads to a positive constraint of a very restricted type, so
the interesting question remains under which type of constraints the positive
theory becomes decidable.